Extended Triple Collocation: estimating errors and correlation coefficients with respect to an unknown target

Abstract

Calibration and validation of geophysical measurement systems typically require knowledge of the “true” value of the target variable. However, the data considered to represent the “true” values often include their own measurement errors, biasing calibration and validation results. Triple collocation (TC) can be used to estimate the root-mean-square-error (RMSE), using observations from three mutually-independent, error-prone measurement systems. Here, we introduce Extended Triple Collocation (ETC): using exactly the same assumptions as TC, we derive an additional performance metric, the correlation coefficient of the measurement system with respect to the unknown target, $\rho_{t,X_i}$. We demonstrate that $\rho_{t,X_i}^2$ is the scaled, unbiased signal-to-noise ratio, and provides a complementary perspective compared to the RMSE. We apply it to three collocated wind datasets. Since ETC is as easy to implement as TC, requires no additional assumptions, and provides an extra performance metric, it may be of interest in a wide range of geophysical disciplines.
Kaighin A. McColl¹, Jur Vogelzang³, Alexandra G. Konings¹, Dara Entekhabi¹,², María Piles⁴,⁵,
Ad Stoffelen³
Corresponding author: Kaighin A. McColl, Department of Civil and Environmental Engineering,
Massachusetts Institute of Technology, Cambridge, MA 02139 USA (kmccoll@mit.edu)
¹ Department of Civil and Environmental Engineering, Massachusetts Institute of Technology,
Cambridge, MA 02139 USA
² Department of Earth, Atmospheric and Planetary Sciences, Massachusetts Institute of
Technology, Cambridge, MA 02139 USA
³ KNMI, De Bilt, The Netherlands
⁴ Remote Sensing Laboratory, Departament de Teoria del Senyal I Comunicacions, Universitat
Politecnica de Catalunya, Barcelona, Spain
⁵ SMOS Barcelona Expert Center, Barcelona, Spain
Key points
Triple collocation is used to estimate the MSE of measurement system estimates
We extend it to estimate the correlation coefficient
The new approach requires no additional assumptions or computational burden
Index terms: Remote sensing (1855); Remote sensing (3360); Remote sensing and
electromagnetic processes (4275); Estimating and forecasting (1816); Model calibration (1846)
Keywords: triple collocation; signal-to-noise ratio; model validation; model calibration;
correlation coefficient
1. Introduction
Geophysical measurement systems, such as in-situ sensor networks, satellites and models,
require calibration and validation. This requires comparison of their measurements with true
observations of the target variable. A range of performance metrics exist to summarize this
comparison, including the root-mean-square-error (RMSE) and correlation coefficient. No single
metric can capture all relevant characteristics of the relationship between the measurement
system and the target, which may include, but are not limited to, the measurement system’s bias,
noise and sensitivity with respect to the target variable (Entekhabi et al., 2010).
In practice, however, the data considered to represent the “true” observations are themselves
imperfect due to their own measurement errors and differences in support scale. Triple
collocation (TC) is a technique for estimating the unknown error standard deviations (or RMSEs)
of three mutually-independent measurement systems, without treating any one system as
perfectly-observed “truth” (Stoffelen, 1998). It assumes a linear error model where errors are
uncorrelated with each other and the target variable. TC has been used widely in oceanography
to estimate errors in measurements of sea surface temperature (Gentemann, 2014; O’Carroll et
al., 2008), wind speed and stress (Portabella and Stoffelen, 2009; Stoffelen, 1998; Vogelzang et
al., 2011), and wave height (Caires and Sterl, 2003; Janssen et al., 2007). It has also been used in
hydrology to estimate errors in measurements of precipitation (Roebeling et al., 2012), fraction
of absorbed photosynthetically active radiation (D’Odorico et al., 2014), leaf area index (Fang et
al., 2012), and, particularly, soil moisture (Anderson et al., 2012; Dorigo et al., 2010; Draper et
al., 2013; Hain et al., 2011; Miralles et al., 2010; Parinussa et al., 2011; Scipal et al., 2008; Su et
al., 2014). It has been applied in data assimilation (Crow and van den Berg, 2010), and can also
be used to optimally rescale two measurement systems to a third reference system (Stoffelen,
1998; Yilmaz and Crow, 2013).
While TC is a powerful approach for estimating one metric of measurement system performance
(RMSE), a suite of metrics is needed for calibration and validation. In this paper, we extend TC
to also estimate the correlation coefficient of each measurement system with respect to the
unknown target variable. We call this approach Extended Triple Collocation (ETC). ETC is
simple to implement and adds no additional assumptions or computational cost to TC. In section
2, we review TC and introduce ETC, deriving an equation for the correlation coefficient from the
assumptions of TC alone. We show that the correlation coefficients provide important insights
into the fidelity of the measurement systems to the target variable beyond those provided by the
RMSE, combining information on the measurement system’s sensitivity and noise with
information on the strength of the target signal. In section 3, we present a collocated dataset of
ocean surface wind measurements from buoys, satellite scatterometers and a Numerical Weather
Prediction (NWP) forecast model and apply ETC to it in section 4.
2. Methods
2.1 Classical triple collocation
In this section, we review the derivation of the TC estimation equations. We begin with an affine
error model relating measurements to a (geophysical) variable, a standard form used in the triple
collocation literature (Zwieback et al., 2012):
$$X_i = \alpha_i + \beta_i t + \varepsilon_i \qquad (1)$$
where the $X_i$ ($i \in \{1, 2, 3\}$) are collocated measurement systems linearly related to the true
underlying value $t$ with additive random errors $\varepsilon_i$, respectively. They could represent, for
instance, outputs from a model, a remotely sensed product, and point measurements from in-situ
stations. $X_i$, $t$ and $\varepsilon_i$ are all random variables. $\alpha_i$ and $\beta_i$ are the ordinary least-squares (OLS)
intercepts and slopes, respectively.
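To make the error model concrete, the following minimal sketch draws synthetic collocated triplets of the form X_i = alpha_i + beta_i * t + eps_i. The function name `simulate_triplets`, the Gaussian distributions and every parameter value are illustrative assumptions and are not taken from the data used in this study; the error model itself does not require Gaussian variables.

```python
import numpy as np

def simulate_triplets(n, alpha, beta, sigma_eps, sigma_t, mean_t=0.0, seed=0):
    """Draw n collocated triplets X_i = alpha_i + beta_i * t + eps_i."""
    rng = np.random.default_rng(seed)
    alpha, beta, sigma_eps = (np.asarray(a, dtype=float) for a in (alpha, beta, sigma_eps))
    t = rng.normal(mean_t, sigma_t, size=n)              # unobserved target
    eps = rng.normal(0.0, 1.0, size=(n, 3)) * sigma_eps  # zero-mean, mutually independent errors
    X = alpha + beta * t[:, None] + eps                  # shape (n, 3), one column per system
    return t, X

# Hypothetical parameter values, chosen only for illustration
t, X = simulate_triplets(50_000, alpha=[0.0, 0.5, -0.3], beta=[1.0, 0.9, 1.1],
                         sigma_eps=[0.5, 0.8, 0.6], sigma_t=2.0)
Q = np.cov(X, rowvar=False)  # sample covariance matrix of the three systems
print(np.round(Q, 2))
```

In a real application only `X` is available; `t` is returned here solely so that synthetic experiments can be checked against the known target.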
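The covariances between the different measurement systems are given by
$$\mathrm{Cov}(X_i, X_j) = E(X_i X_j) - E(X_i) E(X_j) = \beta_i \beta_j \sigma_t^2 + \beta_i \mathrm{Cov}(t, \varepsilon_j) + \beta_j \mathrm{Cov}(t, \varepsilon_i) + \mathrm{Cov}(\varepsilon_i, \varepsilon_j) \qquad (2)$$
where $\sigma_t^2 = \mathrm{Var}(t)$. We assume that the errors from the independent sources have zero mean
($E(\varepsilon_i) = 0$), are uncorrelated with each other ($\mathrm{Cov}(\varepsilon_i, \varepsilon_j) = 0$ for $i \neq j$) and with $t$ ($\mathrm{Cov}(\varepsilon_i, t) = 0$).
Using these assumptions, the two middle terms on the right hand side are zero, and so is the last
when $i \neq j$, so the equation reduces to
$$Q_{ij} \equiv \mathrm{Cov}(X_i, X_j) = \begin{cases} \beta_i \beta_j \sigma_t^2, & i \neq j \\ \beta_i^2 \sigma_t^2 + \sigma_{\varepsilon_i}^2, & i = j \end{cases} \qquad (3)$$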
where . Since there are six unique terms in the covariance matrix
(), we have six equations but seven unknowns
(); therefore, the system is underdetermined and there is no unique
solution. However, if we forego solving for and , and instead define a new variable
, we can write

 (4)
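We now have six equations and six unknowns, and can solve the system. We obtain the TC
estimation equation for RMSE,
$$\boldsymbol{\sigma}_{\varepsilon} = \begin{bmatrix} \sqrt{Q_{11} - \dfrac{Q_{12} Q_{13}}{Q_{23}}} \\ \sqrt{Q_{22} - \dfrac{Q_{12} Q_{23}}{Q_{13}}} \\ \sqrt{Q_{33} - \dfrac{Q_{13} Q_{23}}{Q_{12}}} \end{bmatrix} \qquad (5)$$
We may also solve for $\theta_i$, but this is not typically done in TC. We will show in the next section
that $\theta_i$ contains useful information that forms the basis for ETC. Within the soil moisture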
community, triple collocation is often applied by calculating the covariances of two differences
between products (e.g., Scipal et al., 2008). This is equivalent to deriving the parameters from
the measurement system covariance equations as discussed here.
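The estimation step can be sketched in a few lines. Multiplying and dividing the off-diagonal covariances isolates the signal part of each diagonal term (for the first system, Q12 * Q13 / Q23), which is then subtracted from the corresponding variance. The function name `tc_rmse` and the synthetic parameter values are assumptions made for illustration; this is not the processing code used in this study.

```python
import numpy as np

def tc_rmse(X):
    """Triple collocation error standard deviations for an (n, 3) array of triplets."""
    Q = np.cov(X, rowvar=False)
    var_eps = np.array([
        Q[0, 0] - Q[0, 1] * Q[0, 2] / Q[1, 2],
        Q[1, 1] - Q[0, 1] * Q[1, 2] / Q[0, 2],
        Q[2, 2] - Q[0, 2] * Q[1, 2] / Q[0, 1],
    ])
    # Sampling noise can make a variance estimate negative; report NaN in that case
    var_eps = np.where(var_eps >= 0, var_eps, np.nan)
    return np.sqrt(var_eps)

# Synthetic check with hypothetical parameters
rng = np.random.default_rng(1)
n = 200_000
true_sigma_eps = np.array([0.5, 0.8, 0.6])
t = rng.normal(0.0, 2.0, size=n)
X = (np.array([0.0, 0.5, -0.3]) + np.array([1.0, 0.9, 1.1]) * t[:, None]
     + rng.normal(size=(n, 3)) * true_sigma_eps)
rmse = tc_rmse(X)
print(rmse)  # should be close to true_sigma_eps
```

Note that the intercepts and slopes never enter the calculation: each error standard deviation is returned in the units of its own measurement system.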
In practice, “representativeness errors” may exist due to differences in support scale between
measurement systems, causing subtle cross-correlations between the errors such that
$\mathrm{Cov}(\varepsilon_i, \varepsilon_j) \neq 0$ for $i \neq j$. This introduces additional unknowns into the problem, rendering it
underdetermined. To avoid this, the representativeness error has been ignored in many studies
that use TC, often without justification. However, if an estimate for $\mathrm{Cov}(\varepsilon_i, \varepsilon_j)$ exists, it can be easily
subtracted from $Q_{ij}$. For wind measurements, $\mathrm{Cov}(\varepsilon_i, \varepsilon_j)$ can be estimated using assumptions about the
wind spectra (Stoffelen, 1998; Vogelzang et al., 2011), but little is known about the
representativeness error for other target variables.
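The effect of a known error cross-covariance, and of subtracting it, can be illustrated with exact (population) covariance matrices, so that no sampling noise is involved. In this sketch the function name `tc_error_variances`, the matrix `err_cov` and all numerical values are hypothetical; in particular, the shared error covariance `r2` between systems 1 and 2 is an assumed number, not one of the values used in this study.

```python
import numpy as np

def tc_error_variances(Q, err_cov=None):
    """Error variances from a 3x3 covariance matrix Q.

    err_cov optionally holds known off-diagonal error covariances
    Cov(eps_i, eps_j), which are subtracted from Q before solving."""
    Q = np.array(Q, dtype=float)
    if err_cov is not None:
        Q = Q - np.asarray(err_cov, dtype=float)
    return np.array([
        Q[0, 0] - Q[0, 1] * Q[0, 2] / Q[1, 2],
        Q[1, 1] - Q[0, 1] * Q[1, 2] / Q[0, 2],
        Q[2, 2] - Q[0, 2] * Q[1, 2] / Q[0, 1],
    ])

# Hypothetical population values: beta_i = 1 and Var(t) = 4
var_t = 4.0
true_var_eps = np.array([0.75, 0.86, 0.49])  # total error variances
r2 = 0.5                                     # assumed Cov(eps_1, eps_2)
Q = np.full((3, 3), var_t) + np.diag(true_var_eps)
Q[0, 1] = Q[1, 0] = var_t + r2

err_cov = np.zeros((3, 3))
err_cov[0, 1] = err_cov[1, 0] = r2

uncorrected = tc_error_variances(Q)          # cross-correlation ignored
corrected = tc_error_variances(Q, err_cov)   # cross-correlation subtracted
print(uncorrected)
print(corrected)
```

Ignoring the cross-covariance underestimates the error variances of the two correlated systems and overestimates that of the third; subtracting it recovers `true_var_eps`.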
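If we are willing to treat one of the measurement systems as a reference with known calibration
(i.e., known $\alpha_i$ and $\beta_i$), we can reduce the number of unknowns and solve for the remaining
unknowns without introducing $\theta_i$. Without loss of generality, assume $X_1$ is the reference system
and has been perfectly calibrated to $t$ so that $\alpha_1 = 0$ and $\beta_1 = 1$. Then we have
$$\beta_2 = \frac{Q_{23}}{Q_{13}}, \quad \beta_3 = \frac{Q_{23}}{Q_{12}}, \quad \alpha_2 = \overline{X}_2 - \beta_2 \overline{X}_1, \quad \alpha_3 = \overline{X}_3 - \beta_3 \overline{X}_1 \qquad (6)$$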
where the overbars denote sample means. The system is often solved iteratively, incorporating an
outlier detection and removal process. This is very important since covariance matrix estimates
are highly sensitive to outliers. In many studies, the measurement systems are rescaled before
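applying TC, and it is presumed that $\alpha_1 = \alpha_2 = \alpha_3 = 0$ and $\beta_1 = \beta_2 = \beta_3 = 1$, simplifying the
TC estimation equation to
$$\boldsymbol{\sigma}_{\varepsilon} = \begin{bmatrix} \sqrt{Q_{11} - Q_{12} - Q_{13} + Q_{23}} \\ \sqrt{Q_{22} - Q_{12} - Q_{23} + Q_{13}} \\ \sqrt{Q_{33} - Q_{13} - Q_{23} + Q_{12}} \end{bmatrix} \qquad (7)$$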
Note, however, that this approach should be used with caution. Naively rescaling the
measurement systems (e.g., by matching their temporal variances) and applying this simplified
estimation equation will deliver biased RMSE estimates, since error estimation and calibration
are fundamentally intertwined (Stoffelen, 1998). In practice, consistent estimates of calibration
parameters and error estimates can be obtained by solving the equations iteratively (see
Vogelzang and Stoffelen (2012) for more details), since triple collocation achieves the optimal
rescaling (Yilmaz and Crow, 2013). In this study we calculate RMSEs using equation 5 rather
than rescaling and using equation 7.
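The caution about naive rescaling can be illustrated with exact covariance matrices. The sketch below compares two ways of rescaling before applying the difference-based estimator (error variance of system 1 equal to Q11 - Q12 - Q13 + Q23, and cyclic permutations), which is valid only when all slopes equal one: (a) dividing each system by its TC slope relative to the reference, and (b) matching each system's variance to that of the reference. Function names and parameter values are hypothetical.

```python
import numpy as np

def diff_error_variances(Q):
    """Difference-based error variances, valid only when alpha_i = 0 and beta_i = 1."""
    return np.array([
        Q[0, 0] - Q[0, 1] - Q[0, 2] + Q[1, 2],
        Q[1, 1] - Q[0, 1] - Q[1, 2] + Q[0, 2],
        Q[2, 2] - Q[0, 2] - Q[1, 2] + Q[0, 1],
    ])

def rescale_cov(Q, c):
    """Covariance matrix after multiplying system i by the factor c[i]."""
    c = np.asarray(c, dtype=float)
    return Q * np.outer(c, c)

# Hypothetical population values; system 1 is the reference (beta_1 = 1)
beta = np.array([1.0, 0.8, 1.2])
var_t = 4.0
var_eps = np.array([0.25, 1.0, 0.64])
Q = var_t * np.outer(beta, beta) + np.diag(var_eps)

# (a) TC calibration against the reference: beta_2 = Q23 / Q13, beta_3 = Q23 / Q12
beta_tc = np.array([1.0, Q[1, 2] / Q[0, 2], Q[1, 2] / Q[0, 1]])
calibrated = diff_error_variances(rescale_cov(Q, 1.0 / beta_tc))

# (b) Naive rescaling: match each system's variance to the reference variance
c_naive = np.sqrt(Q[0, 0] / np.diag(Q))
naive = diff_error_variances(rescale_cov(Q, c_naive))

print(calibrated, var_eps / beta**2)   # estimates agree with the rescaled truth
print(naive, var_eps * c_naive**2)     # estimates differ from the rescaled truth
```

Variance matching absorbs part of the error variance into the scaling factor, so the rescaled systems no longer share a unit slope and the difference-based estimates are biased even with a perfectly known covariance matrix.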
2.2 Extended triple collocation
In this section, we show that $\theta_i$ can be used to solve for the correlation coefficients of the
measurement systems with respect to the unknown truth. We demonstrate that the correlation
coefficient contains useful information beyond that provided by the RMSE. Recall that for OLS,
$$\beta_i = \rho_{t,X_i} \frac{\sqrt{Q_{ii}}}{\sigma_t} \qquad (8)$$
where $\rho_{t,X_i}$ is the correlation coefficient between $t$ and $X_i$. Note that this relation can also be
obtained directly from (1) using the standard definitions of correlation and covariance. We
emphasize that the independent variable $t$ is the true underlying value and not subject to
measurement error, so the OLS framework is valid. If there are errors in the measurement of $t$
that are not captured by the error model (1), then the OLS slope will be biased and a new error
model will be required (Cornbleet and Gochman, 1979; Deming, 1943). Overcoming these
biases was, in fact, the original motivation for the development of triple collocation, rather than
the estimation of RMSEs (Stoffelen, 1998).
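The key insight of ETC is that, from (8), we obtain $\theta_i = \beta_i \sigma_t = \rho_{t,X_i} \sqrt{Q_{ii}}$. Since $Q_{ii}$ is already
estimated from the data, and since we can solve for $\theta_i$ using (4), we can solve for $\rho_{t,X_i}$. We
obtain the new ETC estimation equation
$$\boldsymbol{\rho}_{t,X} = \pm \begin{bmatrix} \sqrt{\dfrac{Q_{12} Q_{13}}{Q_{11} Q_{23}}} \\ \mathrm{sign}(Q_{13} Q_{23}) \sqrt{\dfrac{Q_{12} Q_{23}}{Q_{22} Q_{13}}} \\ \mathrm{sign}(Q_{12} Q_{23}) \sqrt{\dfrac{Q_{13} Q_{23}}{Q_{33} Q_{12}}} \end{bmatrix} \qquad (9)$$
where the $\rho_{t,X_i}$ are correct up to a sign ambiguity. In practice, the measurement systems will
almost always be expected to be positively correlated to the unobserved truth.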
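A minimal sketch of the ETC estimate follows: each squared correlation is the signal part of a system's variance (for the first system, Q12 * Q13 / Q23) divided by its total variance Q11. The function name `etc_correlation`, the choice of resolving the sign ambiguity by taking system 1 as positively correlated with the target, and the synthetic parameters are assumptions made for illustration. Because the data are synthetic, the estimates can be compared with the sample correlation against the known target, which is never available in practice.

```python
import numpy as np

def etc_correlation(X):
    """Correlation of each of three collocated systems with the unknown target.

    X is an (n, 3) array. System 1 is assumed to be positively correlated
    with the target, which fixes the overall sign."""
    Q = np.cov(X, rowvar=False)
    rho2 = np.array([
        Q[0, 1] * Q[0, 2] / (Q[0, 0] * Q[1, 2]),
        Q[0, 1] * Q[1, 2] / (Q[1, 1] * Q[0, 2]),
        Q[0, 2] * Q[1, 2] / (Q[2, 2] * Q[0, 1]),
    ])
    signs = np.array([1.0, np.sign(Q[0, 2] * Q[1, 2]), np.sign(Q[0, 1] * Q[1, 2])])
    rho2 = np.where(rho2 >= 0, rho2, np.nan)  # NaN flags an inadmissible estimate
    return signs * np.sqrt(rho2)

# Synthetic check with hypothetical parameters
rng = np.random.default_rng(2)
n = 200_000
t = rng.normal(0.0, 2.0, size=n)
X = (np.array([1.0, 0.9, 1.1]) * t[:, None]
     + rng.normal(size=(n, 3)) * np.array([0.5, 0.8, 0.6]))
rho_etc = etc_correlation(X)
rho_direct = np.array([np.corrcoef(t, X[:, i])[0, 1] for i in range(3)])
print(rho_etc)
print(rho_direct)
```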
The correlation coefficients provide important new information about the performance of the
measurement systems. For the given error model (1), it can be shown that
$$\rho_{t,X_i}^2 = \frac{\beta_i^2 \sigma_t^2}{\beta_i^2 \sigma_t^2 + \sigma_{\varepsilon_i}^2} = \frac{\mathrm{SNR}_{ub}}{1 + \mathrm{SNR}_{ub}} \qquad (10)$$
where we define $\mathrm{SNR}_{ub} = \beta_i^2 \sigma_t^2 / \sigma_{\varepsilon_i}^2$ to be the unbiased signal-to-noise-ratio (in contrast,
the standard signal-to-noise ratio is the ratio of signal power to noise power, which also
includes the squared mean of the signal). The squared correlation coefficient,
therefore, is the unbiased signal-to-noise ratio, scaled between 0 and 1. It combines information
about (i) the sensitivity of the measurement system $\beta_i$ (ii) the variability of the signal $\sigma_t$ and (iii)
the variability of the measurement error $\sigma_{\varepsilon_i}$. In contrast, standard triple collocation only directly
estimates (iii). Note that, while TC returns an estimate for (i), this is estimated with respect to a
reference measurement system. Its intended purpose is for calibration against that reference
measurement system, not as an estimate of the true system sensitivity. Therefore, $\rho_{t,X_i}^2$ contains
useful additional information relevant to measurement system validation that is not included in
$\sigma_{\varepsilon_i}$. This is clear from the fact that, for a fixed MSE, $\rho_{t,X_i}^2$ may take any value between 0 and 1,
its full range. This makes intuitive sense: a given noise level may be too high for a low-
sensitivity system measuring a weak signal, but acceptable for a high-sensitivity system
measuring a strong signal.
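The point that a fixed noise level is compatible with almost any correlation can be checked numerically. The sketch below evaluates the squared correlation implied by the affine error model, beta^2 sigma_t^2 / (beta^2 sigma_t^2 + sigma_eps^2), for three hypothetical combinations of sensitivity and signal variability at the same error standard deviation. The function name and the scenario values are illustrative assumptions.

```python
def rho_squared(beta, sigma_t, sigma_eps):
    """Squared correlation with the target implied by the affine error model."""
    snr_ub = (beta * sigma_t) ** 2 / sigma_eps ** 2  # unbiased signal-to-noise ratio
    return snr_ub / (1.0 + snr_ub)

sigma_eps = 1.0  # same noise level in every scenario
scenarios = {
    "low sensitivity, weak signal": (0.2, 0.5),
    "unit sensitivity, moderate signal": (1.0, 2.0),
    "high sensitivity, strong signal": (1.5, 10.0),
}
for name, (beta, sigma_t) in scenarios.items():
    print(f"{name}: rho^2 = {rho_squared(beta, sigma_t, sigma_eps):.3f}")
```

With identical RMSE, the first scenario yields a squared correlation near 0 and the last a value near 1, which is why the two metrics can rank measurement systems differently.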
We note that $\rho_{t,X_i}^2$ is closely related to the fRMSE metric defined in Draper et al. (2013) as
$\mathrm{fRMSE}_i = \sigma_{\varepsilon_i} / \sqrt{Q_{ii}}$. They are both measures of relative similarity that isolate phase differences
between two signals. Furthermore, it is apparent that
$$\mathrm{fRMSE}_i^2 = 1 - \rho_{t,X_i}^2$$
from equations (3) and (10), and therefore the fRMSE yields identical performance rankings compared to
those obtained using $\rho_{t,X_i}^2$. However, in contrast to the fRMSE, the correlation coefficient has
been used in many validation studies spanning several decades (e.g., Jackson et al. (2012); Mo et
al. (1982); Owe et al. (1992)).
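The relation between the two metrics follows in one line. As an assumption for this worked step, the fractional RMSE is taken to be the error standard deviation normalized by the standard deviation of the measurements, $\sigma_{\varepsilon_i} / \sqrt{Q_{ii}}$; substituting the diagonal of (3) and comparing with (10) gives:

```latex
\mathrm{fRMSE}_i^2
  = \frac{\sigma_{\varepsilon_i}^2}{Q_{ii}}
  = \frac{\sigma_{\varepsilon_i}^2}{\beta_i^2 \sigma_t^2 + \sigma_{\varepsilon_i}^2}
  = 1 - \frac{\beta_i^2 \sigma_t^2}{\beta_i^2 \sigma_t^2 + \sigma_{\varepsilon_i}^2}
  = 1 - \rho_{t,X_i}^2 .
```

Since one metric is a monotonically decreasing function of the other, ranking systems by increasing fRMSE is the same as ranking them by decreasing squared correlation.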
3. Wind data
In this section, we describe the buoy, NWP and scatterometer wind products used in this study as
a case study for ETC. TC was originally designed for application to wind velocities (Stoffelen,
1998), and this target variable more closely matches the assumptions of TC compared to other
variables such as soil moisture (Yilmaz and Crow, 2014). Unlike other target variables,
reasonable estimates of the representativeness error also exist (Stoffelen, 1998; Vogelzang et al.,
2011). We use the same collocated triplets as in Vogelzang et al. (2011) and refer the reader to
this study for more detail on the data used; for completeness, we give a brief description here.
Three different scatterometer products are used. Wind retrievals from EUMETSAT’s C-band
Advanced SCATterometer (ASCAT) are processed to generate two different products: the 12.5-
km resolution ASCAT-12.5 product, and the 25-km resolution ASCAT-25 product. Retrievals
from the SeaWinds sensor on board QuikSCAT are processed to generate the 25-km resolution
SeaWinds-KNMI product. Vogelzang et al. (2011) consider a fourth product, SeaWinds-NOAA,
processed by the National Oceanic and Atmospheric Administration. This product exhibited
anomalous behavior compared to the others and is omitted from this study. Table 1 gives further
details on the scatterometer products used, including their grid size, representativeness errors and
number of observations available that were also collocated with a buoy and NWP measurement.
The very large sample sizes (much larger than the recommended value of ~500 given by
Zwieback et al. (2012)) ensure precise ETC estimates.
Quality-controlled buoy data are taken from the European Centre for Medium-Range Weather
Forecasts (ECMWF) Meteorological Archival and Retrieval System. The NWP forecasts are
also obtained from the ECMWF. Collocated buoy-scatterometer-NWP triplets are obtained for
the period November 1, 2007 to November 30, 2009, except for those including the ASCAT-12.5
product, where the period is October 1, 2008 to November 30, 2009. The study domain is largely
restricted to the tropics and the coasts of Europe and North America, due to a lack of reliable
buoy observations outside these regions. The data are plotted in Figure 1. Note that for each
dataset, the marginal distributions are approximately Gaussian, although Gaussian data are not
required for TC or ETC (indeed, TC has frequently been applied to non-Gaussian data such as
soil moisture). Gaussianity does, however, ensure that the RMSE is well-defined and assists in
interpretation. The block correlations in the SeaWinds-KNMI and buoy data are due to binning
in those datasets.
<insert Table 1 around here>
<insert Figure 1 around here>
We use the ASCAT Wind Data Processor (AWDP) triple collocation scheme described in
Vogelzang and Stoffelen (2012, available at
http://research.metoffice.gov.uk/research/interproj/nwpsaf/scatterometer/TripleCollocation_NWPSAF_TR_KN_021_v1_0.pdf),
updated to also calculate correlation coefficients. The scheme
solves iteratively for the RMSEs and correlation coefficients and includes quality-control and
outlier detection and removal steps. We subtract out representativeness errors (Table 1)
calculated in Vogelzang et al. (2011) and estimate 95% confidence intervals using bootstrapping
(Efron and Tibshirani, 1994) with N = 100 replicates. We perform the analysis with
buoy-scatterometer-NWP forecast triplets three separate times, using a different scatterometer product
each time. In all analyses, the buoy data are used as the reference dataset, for consistency with
Vogelzang et al. (2011). However, the choice of reference system only impacts the estimates of
$\alpha_i$ and $\beta_i$ (not shown), not our estimates of $\sigma_{\varepsilon_i}$ and $\rho_{t,X_i}$.
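The bootstrap step can be sketched as follows. This is a simplified stand-in for illustration, not the AWDP scheme: it omits the iterative solution, quality control, outlier removal and representativeness-error correction, and it resamples whole triplets with replacement, which assumes the triplets are independent of one another. Function names and synthetic parameters are hypothetical; the replicate count of 100 and the 95% level follow the text.

```python
import numpy as np

def etc_estimates(X):
    """Return (rmse, rho) for an (n, 3) array of collocated triplets."""
    Q = np.cov(X, rowvar=False)
    theta2 = np.array([
        Q[0, 1] * Q[0, 2] / Q[1, 2],
        Q[0, 1] * Q[1, 2] / Q[0, 2],
        Q[0, 2] * Q[1, 2] / Q[0, 1],
    ])
    d = np.diag(Q)
    # Clip at zero so that sampling noise cannot produce a negative argument
    rmse = np.sqrt(np.clip(d - theta2, 0.0, None))
    rho = np.sqrt(np.clip(theta2 / d, 0.0, None))
    return rmse, rho

def bootstrap_ci(X, n_boot=100, level=0.95, seed=0):
    """Percentile bootstrap intervals, resampling whole triplets with replacement."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    reps = np.empty((n_boot, 2, 3))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)
        reps[b] = etc_estimates(X[idx])
    q = [100 * (1 - level) / 2, 100 * (1 + level) / 2]
    lo, hi = np.percentile(reps, q, axis=0)
    return lo, hi  # each has shape (2, 3): row 0 is RMSE, row 1 is correlation

# Synthetic data with hypothetical parameters
rng = np.random.default_rng(42)
n = 20_000
beta = np.array([1.0, 0.9, 1.1])
sigma_t = 2.0
true_sigma_eps = np.array([0.5, 0.8, 0.6])
t = rng.normal(0.0, sigma_t, size=n)
X = beta * t[:, None] + rng.normal(size=(n, 3)) * true_sigma_eps
lo, hi = bootstrap_ci(X)
print("RMSE 95% CI:", np.round(lo[0], 3), np.round(hi[0], 3))
print("rho  95% CI:", np.round(lo[1], 3), np.round(hi[1], 3))
```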
4. Results and Discussion
<insert Figure 2 around here>
Figure 2 shows the ETC estimates of u, v RMSE and correlation coefficient for the buoy, NWP
and various scatterometer products. The RMSE estimates are all low and the correlation
coefficients are all high. They are consistent with reasonable guesses for $\beta_i$ and $\sigma_t$. As an
example, consider the ETC estimates of scatterometer u RMSE and
correlation coefficient, estimated using ASCAT-12.5 scatterometer data (we
use the mean of the bootstrapped replicates here). Substituting these into (10), and assuming $\beta_i = 1$,
we obtain an estimate of $\sigma_t$. While the true value of $\sigma_t$ is unknown, this estimate appears very reasonable
given the marginal distribution of u in Fig. 1a.
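This consistency check amounts to inverting the relation between the squared correlation, the sensitivity, the signal variability and the noise: sigma_t = (sigma_eps / |beta|) * sqrt(rho^2 / (1 - rho^2)). The sketch below implements that inversion. The function name is hypothetical, and the input values are placeholders chosen for illustration; they are not the estimates reported in this study.

```python
import math

def sigma_t_from_etc(rmse, rho, beta=1.0):
    """Target standard deviation implied by an RMSE, a correlation and an assumed sensitivity."""
    if not 0.0 < abs(rho) < 1.0:
        raise ValueError("rho must be non-zero and lie strictly between -1 and 1")
    rho2 = rho ** 2
    return (rmse / abs(beta)) * math.sqrt(rho2 / (1.0 - rho2))

# Placeholder inputs for illustration only
sigma_t = sigma_t_from_etc(rmse=1.0, rho=0.98, beta=1.0)
print(f"implied sigma_t = {sigma_t:.2f}")
```

The implied value can then be compared with the spread of the observed marginal distribution, as done in the text.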
The results demonstrate the importance of using a validation metric that combines measures of
noise and sensitivity, rather than noise alone. Focusing on the scatterometer ETC estimates, we
see that, for u, the highest correlation coefficients correspond to the lowest RMSEs and vice
versa. Since $\sigma_t$ does not vary between scatterometer products, this suggests that differences in
noise dominate differences in sensitivity between products. For v, however, this is not the case:
ASCAT-12.5 has the highest RMSE but does not have the lowest correlation coefficient. This
suggests that, while ASCAT-12.5 estimates of v are noisier than those of ASCAT-25, ASCAT-
12.5 has a greater  because it is more sensitive to the signal, v, although it may also be an
artifact caused by incorrect assumptions in the error model (1). In this case study, the differences
in noise and sensitivity between products are relatively small. However, it is easy to imagine
scenarios where validating multiple satellite products on the basis of RMSE alone, compared to a
combination of RMSE and correlation coefficient, could yield very different interpretations of
their relative performances.
ETC builds on TC, but also inherits its weaknesses. Using different scatterometer products, we
would expect the ETC estimates of buoy RMSEs and correlation coefficients to remain identical;
similarly, for the NWP estimates. While the differences are small, they are too large to be
explained by sampling error (particularly for the NWP estimates), and are likely due to subtle
violations of the error model’s assumptions or inaccurate corrections for representativeness
errors. If the error model given in (1) is not valid, the estimates of RMSE and correlation
coefficient will be biased. The results are particularly sensitive to the assumption of independent
errors between buoy, scatterometer and NWP estimates. However, these are all pre-existing
weaknesses in TC and not unique to ETC. ETC uses exactly the same assumptions as TC.
5. Conclusions
Triple collocation is a powerful and popular technique for calibrating and validating
measurement system estimates of geophysical target variables. In this paper, we introduced ETC:
using exactly the same error model and assumptions as TC, we derived the correlation
coefficient of each measurement system with respect to the unknown target variable. We
demonstrated that ETC’s correlation coefficient provides useful insights into the correspondence
between the measurement system estimates and the target variable, beyond those provided by
TC’s RMSE estimate. By integrating information on the measurement system’s sensitivity to the
target variable, measurement noise and the variability of the target variable itself, the correlation
coefficient provides a complementary (and sometimes, very different) perspective to that of the
RMSE when validating measurement systems. In particular, the measurement noise (estimated
by the RMSE) is much more informative when interpreted relative to the observed signal: for
instance, a small amount of measurement noise, in absolute terms, may still be of concern if the
measurement system is relatively insensitive to the target variable, and/or the target signal is
weak (Entekhabi et al., 2010). Since ETC uses exactly the same assumptions as TC, it appears
that it may also facilitate the estimation of correlation coefficients in recent generalizations of TC
from three measurement systems to more than three (Zwieback et al., 2012) and, in cases where the target
variable has sufficient temporal autocorrelation, to two (Su et al., 2014). Finally, since ETC is as
easy to implement as TC, requires no additional assumptions, and provides estimates of two
complementary performance metrics instead of one, we suggest it may be of interest to
practitioners in a wide range of geophysical disciplines.
Acknowledgements
The authors thank Chun-Hsu Su (University of Melbourne) and Marcos Portabella (ICM-CSIC)
for useful discussions. We also thank an anonymous reviewer and Wade Crow (USDA) for
constructive and thorough reviews. The data used in this paper are available on request from the
corresponding author.
References
Anderson, W.B., Zaitchik, B.F., Hain, C.R., Anderson, M.C., Yilmaz, M.T., Mecikalski, J.,
Schultz, L., 2012. Towards an integrated soil moisture drought monitor for East Africa.
Hydrol. Earth Syst. Sci. 16, 2893–2913. doi:10.5194/hess-16-2893-2012
Caires, S., Sterl, A., 2003. Validation of ocean wind and wave data using triple collocation. J.
Geophys. Res. Oceans 108, 3098. doi:10.1029/2002JC001491
Cornbleet, P.J., Gochman, N., 1979. Incorrect least-squares regression coefficients in
method-comparison analysis. Clin. Chem. 25, 432–438.
Crow, W.T., van den Berg, M.J., 2010. An improved approach for estimating observation and
model error parameters in soil moisture data assimilation. Water Resour. Res. 46,
W12519. doi:10.1029/2010WR009402
D’Odorico, P., Gonsamo, A., Pinty, B., Gobron, N., Coops, N., Mendez, E., Schaepman, M.E.,
2014. Intercomparison of fraction of absorbed photosynthetically active radiation
products derived from satellite data over Europe. Remote Sens. Environ. 142, 141–154.
doi:10.1016/j.rse.2013.12.005
Deming, W.E., 1943. Statistical adjustment of data. Wiley, New York.
Dorigo, W.A., Scipal, K., Parinussa, R.M., Liu, Y.Y., Wagner, W., de Jeu, R.A.M., Naeimi, V.,
2010. Error characterisation of global active and passive microwave soil moisture data
sets. Hydrol. Earth Syst. Sci. Discuss. 7, 5621–5645. doi:10.5194/hessd-7-5621-2010
Draper, C., Reichle, R., de Jeu, R., Naeimi, V., Parinussa, R., Wagner, W., 2013. Estimating root
mean square errors in remotely sensed soil moisture over continental scale domains.
Remote Sens. Environ. 137, 288–298. doi:10.1016/j.rse.2013.06.013
Efron, B., Tibshirani, R.J., 1994. An Introduction to the Bootstrap. CRC Press.
Entekhabi, D., Reichle, R.H., Koster, R.D., Crow, W.T., 2010. Performance Metrics for Soil
Moisture Retrievals and Application Requirements. J. Hydrometeorol. 11, 832–840.
doi:10.1175/2010JHM1223.1
Fang, H., Wei, S., Jiang, C., Scipal, K., 2012. Theoretical uncertainty analysis of global MODIS,
CYCLOPES, and GLOBCARBON LAI products using a triple collocation method.
Remote Sens. Environ. 124, 610–621. doi:10.1016/j.rse.2012.06.013
Gentemann, C.L., 2014. Three way validation of MODIS and AMSR-E sea surface temperatures.
J. Geophys. Res. Oceans 119, 2583–2598. doi:10.1002/2013JC009716
Hain, C.R., Crow, W.T., Mecikalski, J.R., Anderson, M.C., Holmes, T., 2011. An
intercomparison of available soil moisture estimates from thermal infrared and passive
microwave remote sensing and land surface modeling. J. Geophys. Res. Atmospheres
116, D15107. doi:10.1029/2011JD015633
Jackson, T.J., Bindlish, R., Cosh, M.H., Zhao, T., Starks, P.J., Bosch, D.D., Seyfried, M., Moran,
M.S., Goodrich, D.C., Kerr, Y.H., Leroux, D., 2012. Validation of Soil Moisture and
Ocean Salinity (SMOS) Soil Moisture Over Watershed Networks in the U.S. IEEE Trans.
Geosci. Remote Sens. 50, 1530–1543. doi:10.1109/TGRS.2011.2168533
Janssen, P.A.E.M., Abdalla, S., Hersbach, H., Bidlot, J.-R., 2007. Error Estimation of Buoy,
Satellite, and Model Wave Height Data. J. Atmospheric Ocean. Technol. 24, 1665–1677.
doi:10.1175/JTECH2069.1
Miralles, D.G., Crow, W.T., Cosh, M.H., 2010. Estimating Spatial Sampling Errors in
Coarse-Scale Soil Moisture Estimates Derived from Point-Scale Observations. J. Hydrometeorol.
11, 1423–1429. doi:10.1175/2010JHM1285.1
Mo, T., Choudhury, B.J., Schmugge, T.J., Wang, J.R., Jackson, T.J., 1982. A model for
microwave emission from vegetation-covered fields. J. Geophys. Res. Oceans 87,
11229–11237. doi:10.1029/JC087iC13p11229
O’Carroll, A.G., Eyre, J.R., Saunders, R.W., 2008. Three-Way Error Analysis between AATSR,
AMSR-E, and In Situ Sea Surface Temperature Observations. J. Atmospheric Ocean.
Technol. 25, 1197–1207. doi:10.1175/2007JTECHO542.1
Owe, M., van de Griend, A.A., Chang, A.T.C., 1992. Surface moisture and satellite microwave
observations in semiarid southern Africa. Water Resour. Res. 28, 829–839.
doi:10.1029/91WR02765
Parinussa, R.M., Holmes, T.R.H., Yilmaz, M.T., Crow, W.T., 2011. The impact of land surface
temperature on soil moisture anomaly detection from passive microwave observations.
Hydrol. Earth Syst. Sci. 15, 3135–3151. doi:10.5194/hess-15-3135-2011
Portabella, M., Stoffelen, A., 2009. On Scatterometer Ocean Stress. J. Atmospheric Ocean.
Technol. 26, 368–382. doi:10.1175/2008JTECHO578.1
Roebeling, R.A., Wolters, E.L.A., Meirink, J.F., Leijnse, H., 2012. Triple Collocation of
Summer Precipitation Retrievals from SEVIRI over Europe with Gridded Rain Gauge
and Weather Radar Data. J. Hydrometeorol. 13, 1552–1566.
doi:10.1175/JHM-D-11-089.1
Scipal, K., Holmes, T., de Jeu, R., Naeimi, V., Wagner, W., 2008. A possible solution for the
problem of estimating the error structure of global soil moisture data sets. Geophys. Res.
Lett. 35. doi:10.1029/2008GL035599
Stoffelen, A., 1998. Toward the true near-surface wind speed: Error modeling and calibration
using triple collocation. J. Geophys. Res. 103, 7755–7766.
Su, C.-H., Ryu, D., Crow, W.T., Western, A.W., 2014. Beyond triple collocation: Applications
to soil moisture monitoring. J. Geophys. Res. Atmospheres 2013JD021043.
doi:10.1002/2013JD021043
Vogelzang, J., Stoffelen, A., 2012. Triple collocation. EUMETSAT Report. Available at
http://research.metoffice.gov.uk/research/interproj/nwpsaf/scatterometer/TripleCollocation_NWPSAF_TR_KN_021_v1_0.pdf
Vogelzang, J., Stoffelen, A., Verhoef, A., Figa-Saldaña, J., 2011. On the quality of
high-resolution scatterometer winds. J. Geophys. Res. 116. doi:10.1029/2010JC006640
Yilmaz, M.T., Crow, W.T., 2013. The Optimality of Potential Rescaling Approaches in Land
Data Assimilation. J. Hydrometeorol. 14, 650–660. doi:10.1175/JHM-D-12-052.1
Yilmaz, M.T., Crow, W.T., 2014. Evaluation of Assumptions in Soil Moisture Triple Collocation
Analysis. J. Hydrometeorol. doi:10.1175/JHM-D-13-0158.1
Zwieback, S., Scipal, K., Dorigo, W., Wagner, W., 2012. Structural and statistical properties of
the collocation technique for error characterization. Nonlinear Process. Geophys. 19,
69–80. doi:10.5194/npg-19-69-2012
Tables
Table 1: Scatterometer productsᵃ

| Product | Grid size (km) | Representativeness error, u | Representativeness error, v | N |
|---|---|---|---|---|
| ASCAT-12.5 | 12.5 | 0.63 | 1.00 | 32,317 |
| ASCAT-25 | 25 | 0.49 | 0.69 | 54,187 |
| SeaWinds-KNMI | 25 | 1.28 | 0.44 | 76,947 |

ᵃ The scatterometer products and values used are identical to those used in Vogelzang et al.
(2011). N is the number of collocated triplets available for each product. The third and fourth
columns give the estimated representativeness errors in the u and v wind component
measurements, respectively.
Figures
Figure 1. Scatter plots and kernel-density-estimated marginal distributions (on the same axes) for
the wind data used in this study, where u is the zonal wind velocity and v is the meridional wind
velocity. Plots for scatterometer products correspond to a) ASCAT-12.5 b) ASCAT-25 c)
SeaWinds-KNMI. Plots for d) buoys and e) NWP products are also shown. The marginal
distributions are all approximately normal for all products used.
Figure 2. (Rows 1 and 3): Triple collocation estimates of the RMSEs for u and v
for the buoy, scatterometer and NWP products, respectively, calculated using equation 5. The
analysis is performed with buoy-scatterometer-NWP forecast triplets three separate times, using
a different scatterometer product each time (a) ASCAT-12.5 b) ASCAT-25 c) SeaWinds-KNMI).
(Rows 2 and 4): Extended triple collocation estimates of the correlation coefficient for u
and v for the buoy, scatterometer and NWP products, respectively, calculated
using equation 9. Bootstrap estimates (N = 100 replicates) of the 95% confidence intervals are
shown for each estimate. The bootstrapped sample means of the u and v RMSEs are identical to the
values given in Table 4 of Vogelzang et al. (2011).
... There are numerous precipitation datasets from various sensors and models and these could be used to calculate precipitation data error at a large spatial scale (Xu et al., 2020b;Sun et al., 2018). Three-cornered hat (TCH) and triple collocation (TC) are two commonly used methods to evaluate the random error among multisource datasets, which do not require ground measurements as references (Premoli and Tavella, 1993;Mccoll et al., 2014;Stoffelen, 1998). The ba-sic assumption of the TCH and TC methods is the stationarity of both the raw dataset and its error, which may not always be satisfied for real-world data. ...
... The collected three datasets are all reanalysis data, which are generated from different physical models and data assimilation algorithms. The different reanalysis datasets and their errors are generally not closely correlated and are regarded as collocated datasets for the uncertainty estimation, similar to existing studies (Xu et al., 2021b;Mccoll et al., 2014;Gruber et al., 2017). In the TCH algorithm, one arbitrary dataset is chosen as the reference among the three datasets, and then the differencing operation is conducted between the reference and the other two datasets to get the differencing series. ...
... The Monte Carlo method is a kind of stochastic simulation technology, proposed by Stanislaw Ulam and John von Neumann during the Second World War (Von Neumann and Ulam, 1951). Monte Carlo methods are used to estimate unknown parameters by random sampling and are widely applied in mathematics, physics, game theory and finance (Brooks, 1998;Jacoboni and Lugli, 2012;Metropolis and Ulam, 1949;Rubinstein and Kroese, 2016). In Eq. (14), the posterior distribution p(w|X, Y ) cannot be solved analytically. ...
Article
Precipitation forecasting is an important mission in weather science. In recent years, data-driven precipitation forecasting techniques could complement numerical prediction, such as precipitation nowcasting, monthly precipitation projection and extreme precipitation event identification. In data-driven precipitation forecasting, the predictive uncertainty arises mainly from data and model uncertainties. Current deep learning forecasting methods could model the parametric uncertainty by random sampling from the parameters. However, the data uncertainty is usually ignored in the forecasting process and the derivation of predictive uncertainty is incomplete. In this study, the input data uncertainty, target data uncertainty and model uncertainty are jointly modeled in a deep learning precipitation forecasting framework to estimate the predictive uncertainty. Specifically, the data uncertainty is estimated a priori and the input uncertainty is propagated forward through model weights according to the law of error propagation. The model uncertainty is considered by sampling from the parameters and is coupled with input and target data uncertainties in the objective function during the training process. Finally, the predictive uncertainty is produced by propagating the input uncertainty in the testing process. The experimental results indicate that the proposed joint uncertainty modeling framework for precipitation forecasting exhibits better forecasting accuracy (improving RMSE by 1 %–2 % and R2 by 1 %–7 % on average) relative to several existing methods, and could reduce the predictive uncertainty by ∼28 % relative to the approach of Loquercio et al. (2020). The incorporation of data uncertainty in the objective function changes the distributions of model weights of the forecasting model and the proposed method can slightly smooth the model weights, leading to the reduction of predictive uncertainty relative to the method of Loquercio et al. (2020). 
The predictive accuracy is improved in the proposed method by incorporating the target data uncertainty and reducing the forecasting error of extreme precipitation. The developed joint uncertainty modeling method can be regarded as a general uncertainty modeling approach to estimate predictive uncertainty from data and model in forecasting applications.
... To address this limitation, collocation analysis methods have emerged as promising techniques to estimate the random error standard deviation and data-truth correlations for collocated datasets on a grid-cell-by-grid-cell basis for the entire globe (Dong et al., 2020c; Gruber et al., 2020; McColl et al., 2016; Stoffelen, 1998). These methods regard the errors associated with collocated datasets as a true representation of uncertainty, without assuming that any of the datasets are free of errors (McColl et al., 2014). The main advantage of collocation analysis is that no high-quality reference dataset is required (Su et al., 2014; Wu et al., 2021). ...
... This model is referred to as the additive error structure model. McColl et al. (2014) introduced the extended collocation algorithm and illustrated that the additive error model tends to overestimate the uncertainty, which was further validated by Li et al. (2018). As an improvement, a multiplicative error model is often used in practice with the equation described as follows: ...
Article
Evapotranspiration (ET) is one of the key elements linking Earth’s water-carbon system. Accurate estimation of global land evapotranspiration is essential for understanding land-atmosphere interactions under a changing climate. However, due to a lack of observations at the global scale, inherent uncertainties limit the direct use of these data. In this study, we employed collocation analysis methods, including single and double instrumental variable algorithms (IVS/IVD), triple collocation (TC), quadruple collocation (QC) and extended double instrumental variable algorithms (EIVD) to evaluate five widely used ET products at 0.1° and 0.25° resolutions over daily and 8-day frequencies. To validate the reliability of collocation methods, the collocation analysis results were compared with evaluations based on in-situ observations. The results exhibited reasonably high accuracy with an average coefficient of determination of 0.71 for all methods. In addition, IVD, EIVD and QC demonstrated better performances than other methods. In general, the ERA5 and GLEAM products showed lower uncertainty than the other products over 0.1° and 0.25°, respectively. Although the error resulting from nonzero error cross-correlation (ECC) should be considered, the ECC results from EIVD and QC revealed that this influence was acceptable in our study. Overall, this study presented a comprehensive application and comparison of all collocation analysis methods for error characterization of ET products. The findings suggested that collocation analysis methods could be reliable tools to serve as alternatives for tower observations at the global scale, which could be helpful for further data assimilation and merging.
... A multi-sensor data fusion (Gao et al., 2011) based on the observation noises of each data is an efficient way to make the data more accurate in high wind regimes, with observation errors assumed to be independent of each other (McColl et al., 2014; Saha et al., 2020). The weighting factors are inversely proportional to their observation error variances for the sensors. ...
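The inverse-variance weighting described in this excerpt can be sketched as below. This is an illustrative assumption about the weighting step only, not the cited product's implementation: it presumes the sensors are unbiased, expressed on a common scale, and have mutually independent errors. The names `inverse_variance_weights` and `merge` and all numbers are hypothetical.

```python
import numpy as np


def inverse_variance_weights(error_variances):
    """Weights proportional to 1 / error variance, normalised to sum to 1."""
    inv = 1.0 / np.asarray(error_variances, dtype=float)
    return inv / inv.sum()


def merge(estimates, error_variances):
    """Weighted average of collocated estimates (rows = sensors)."""
    w = inverse_variance_weights(error_variances)
    return w @ np.asarray(estimates, dtype=float)


# Hypothetical error variances for three sensors (illustrative only).
weights = inverse_variance_weights([0.25, 0.09, 0.49])
merged = merge([[10.0, 12.0], [11.0, 12.5], [9.0, 13.0]], [0.25, 0.09, 0.49])
```

Under these assumptions the sensor with the smallest error variance receives the largest weight, and the merged error variance is 1 / Σ(1/σ_i²), which is never larger than the smallest input variance.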
... where σ_t^2 is the variance of the true value. For a set of three collocated datasets, following McColl et al. (2014) and using six unique terms of a 3 × 3 covariance matrix (Q_11, Q_12, Q_13, Q_22, Q_23, and Q_33), equation (7) can be solved to obtain the root mean square error (RMSE), as follows: ...
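The excerpt elides the resulting equations. A minimal sketch of the commonly used covariance-based TC solution is given here for orientation; it assumes the standard affine error model with mutually independent errors that are also independent of the target, and it is not taken from the citing paper. The function name `tc_rmse` and the synthetic data are hypothetical.

```python
import numpy as np


def tc_rmse(x1, x2, x3):
    """Error standard deviation of each dataset, in its own units,
    from the six unique terms of the 3 x 3 sample covariance matrix Q."""
    q = np.cov(np.vstack([x1, x2, x3]))
    err_var = np.array([
        q[0, 0] - q[0, 1] * q[0, 2] / q[1, 2],
        q[1, 1] - q[0, 1] * q[1, 2] / q[0, 2],
        q[2, 2] - q[0, 2] * q[1, 2] / q[0, 1],
    ])
    # Sampling noise can make err_var slightly negative for short records;
    # that case is left unhandled in this sketch.
    return np.sqrt(err_var)


# Synthetic check with known error standard deviations 0.5, 0.3 and 0.7.
rng = np.random.default_rng(0)
t = rng.normal(0.0, 1.0, 200_000)
x1 = t + rng.normal(0.0, 0.5, t.size)
x2 = 0.5 + 0.8 * t + rng.normal(0.0, 0.3, t.size)
x3 = -1.0 + 1.2 * t + rng.normal(0.0, 0.7, t.size)
rmse = tc_rmse(x1, x2, x3)
```

The estimates do not depend on the additive or multiplicative biases of the three systems, since only covariances enter the expressions.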
Article
Improving forecasts of storms and hurricanes and their potential impacts is highly important to public safety, economic security, commerce, and community infrastructure. One key element of forecast improvement is more accurate and increased spatial–time coverage of observational data for model calibration, quality control and initialization, and/or data assimilation. The National Oceanic and Atmospheric Administration (NOAA) has been producing a global gridded 0.25° and 6-hourly sea surface winds product that has wide applications in marine transportation, marine ecosystem and fisheries, offshore winds, weather and ocean forecasts, and other areas. The NOAA National Centers for Environmental Information (NCEI) Blended Sea winds (NBS) v1.0 product is generated by blending observations from multiple sources (satellites), including scatterometers and microwave radiometers/imagers. However, these sensors do not provide accurate observations of intensive high-speed hurricane winds because their signals saturate in very high winds or degrade in the presence of rain. Recent advancements in satellite wind retrievals revealed that the L-band (1.42 GHz) instrument on the Soil Moisture Active Passive (SMAP) satellite and the AMSR2 All-Weather channel (~6.9 GHz) can provide accurate hurricane winds of up to 65 m/s (145 MPH) without being affected by rain; these data are incorporated in a new version of the Blended Sea Winds, NBS v2.0, using a multi-sensor data fusion technique based on random errors, enabling it to resolve very high winds, especially along the eyewalls of tropical cyclones and hurricanes. NBS v2.0 provides both a long-term record of 30+ years retrospectively since July 1987 and a near-real-time mode with 1-day latency.
... This approach was originally developed by Stoffelen (1998) and has been widely used in soil moisture evaluation (Dorigo et al. 2010; Miralles et al. 2010; Yilmaz et al. 2012). The TC approach was later extended by McColl et al. (2014) and optimized by Alemohammad et al. (2015), termed the multiplicative TC (MTC). Compared to the traditional TC approach, MTC is more appropriate for quantifying errors in precipitation estimates and has been shown effective in precipitation evaluation (Li et al. 2018; Tang et al. 2020). ...
... Assuming that there are three collocated estimates of precipitation and that the errors in each product are independent of each other and of the true precipitation, the root-mean-square error (RMSE) of each product can be estimated using the following equations (McColl et al. 2014): ...
Article
Satellite-based and reanalysis precipitation estimates are an alternative and important supplement to rain gauge data. However, performance of China’s Fengyun (FY) satellite precipitation product and how it compares with other mainstream satellite and reanalysis precipitation products over China remain largely unknown. Here five satellite-based precipitation products (i.e., FY-2 precipitation product, IMERG, GSMaP, CMORPH, and PERSIANN-CDR) and one reanalysis product (i.e., ERA5) are intercompared and evaluated based on in situ daily precipitation measurements over mainland China during 2007–17. Results show that the performance of these precipitation products varies with regions and seasons, with better statistical metrics over wet regions and during warm seasons. The infrared–microwave combined precipitation [i.e., IMERG, GSMaP, and CMORPH, with median KGE (Kling–Gupta efficiency) values of 0.53, 0.52, 0.59, respectively] reveals better performance than the infrared-based only product (i.e., PERSIANN-CDR, with a median KGE of 0.31) and the reanalysis product (i.e., ERA5, with a median KGE of 0.43). IMERG performs well in retrieving precipitation intensity and occurrence over China, while GSMaP performs well in the middle to low reaches of the Yangtze River basin but poorly over sparsely gauged regions, e.g., Xinjiang in northwest China and the Tibetan Plateau. CMORPH performs well over most regions and has a greater ability to detect precipitation events than GSMaP. The FY-2 precipitation product can capture the overall spatial distribution of precipitation in terms of both precipitation intensity and occurrence (median KGE and CSI of 0.54 and 0.55), and shows better performance than other satellite precipitation products in winter and over sparsely gauged regions. Annual precipitation from different products is generally consistent, though underestimation exists in the FY-2 precipitation product during 2015–17.
... For example, the TC method was used to evaluate the global errors of ASCAT, AMSR-E, and ERA reanalysis SSM [54], which showed that the TC method is robust and can generate objective error estimates. The TC method makes four assumptions for temporal r estimation when the truth values are unknown [55]: (1) there is a linear relationship between each of the three kinds of SSM and the unknown true SSM; (2) the errors are stable and do not change over time; (3) the errors of the three kinds of SSM are independent of each other; (4) the errors of the three kinds of SSM are independent of the unknown truth values. As fused MODIS SSM and AMSR-E SSM are related to each other, a triplet pattern of in situ, Noah, and remote sensing SSM is built for the TC evaluation in the study. ...
... Referring to previous studies [45,55], the valid number of data points should be greater than 100 for each pixel SSM in the TC triplet. Like the temporal evaluation against in situ data, the TC evaluations are still carried out at the selected 29 in situ sites (the black triangle shown in Figure 1b). ...
Article
The coarse scale of passive microwave surface soil moisture (SSM) is not suitable for regional agricultural and hydrological applications such as drought monitoring and irrigation management. The optical/thermal infrared (OTI) data-based passive microwave SSM downscaling method can effectively improve its spatial resolution to fine scale for regional applications. However, the estimation capability of SSM with long time series is limited by OTI data, which are heavily polluted by clouds. To reduce the dependence of the method on OTI data, an SSM retrieval and spatio-temporal fusion model (SMRFM) is proposed in the study. Specifically, a model coupling in situ data, MODerate-resolution Imaging Spectro-radiometer (MODIS) OTI data, and topographic information is developed to retrieve MODIS SSM (1 km) using the least squares method. Then the retrieved MODIS SSM and the spatio-temporal fusion model are employed to downscale the passive microwave SSM from coarse scale to 1 km. The proposed SMRFM is implemented in a grassland dominated area over Naqu, central Tibet Plateau, for Advanced Microwave Scanning Radiometer—Earth Observing System sensor (AMSR-E) SSM downscaling in unfrozen period. The in situ SSM and Noah land surface model 0.01° SSM are used to validate the estimated MODIS SSM with long time series. The evaluations show that the estimated MODIS SSM has the same temporal resolution with AMSR-E and obtains significantly improved detailed spatial information. Moreover, the temporal accuracy of estimated MODIS SSM against in situ data (r = 0.673, μbRMSE = 0.070 m3/m3) is better than the AMSR-E (r = 0.661, μbRMSE = 0.111 m3/m3). In addition, the temporal r of estimated MODIS SSM is obviously higher than that of Noah data. Therefore, this suggests that the SMRFM can be used to estimate MODIS SSM with long time series by AMSR-E SSM downscaling in the study. 
Overall, the study can provide help for the development and application of microwave SSM-related scientific research at the regional scale.
... Roebeling et al. [22] first used the TC approach to assess the errors of precipitation estimation products. McColl et al. [23] extended the assessment metrics of the TC approach from the root mean square error (RMSE) to the correlation coefficient (CC) between the estimates and the unknown ground truth. Alemohammad et al. [24] improved the TC approach for precipitation applications by introducing the multiplicative error model to replace the additive error model. ...
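The extension from RMSE to the correlation coefficient mentioned in this excerpt can be written out as a short derivation. It is a sketch under the affine error model usually adopted for TC, with the notation below (α_i, β_i, ε_i, Q_ij) introduced here as an assumption rather than quoted from the citing paper; the sign of ρ is taken as positive.

```latex
% Assumed model: X_i = \alpha_i + \beta_i t + \varepsilon_i, with errors
% mutually independent and independent of the target t.
\begin{align}
Q_{ij} &= \operatorname{Cov}(X_i, X_j) =
  \begin{cases}
    \beta_i \beta_j \sigma_t^2, & i \neq j,\\
    \beta_i^2 \sigma_t^2 + \sigma_{\varepsilon_i}^2, & i = j,
  \end{cases}\\
\rho_{t,X_1}^2 &= \frac{\beta_1^2 \sigma_t^2}{Q_{11}}
  = \frac{Q_{12}\, Q_{13}}{Q_{11}\, Q_{23}}, \qquad
\rho_{t,X_2}^2 = \frac{Q_{12}\, Q_{23}}{Q_{22}\, Q_{13}}, \qquad
\rho_{t,X_3}^2 = \frac{Q_{13}\, Q_{23}}{Q_{33}\, Q_{12}},\\
\rho_{t,X_i}^2 &= \frac{\mathrm{SNR}_i}{1 + \mathrm{SNR}_i}, \qquad
\mathrm{SNR}_i = \frac{\beta_i^2 \sigma_t^2}{\sigma_{\varepsilon_i}^2}.
\end{align}
```

The last line shows why the squared correlation can be read as a scaled signal-to-noise ratio: it increases monotonically from 0 to 1 as SNR_i grows.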
Article
Obtaining accurate near-real-time precipitation data and merging multiple precipitation estimates require sufficient in-situ rain gauge networks. The triple collocation (TC) approach is a novel error assessment method that does not require rain gauge data and provides reasonable precipitation estimates by merging data; this study assesses the TC approach for producing reliable near-real-time satellite-based precipitation estimate (SPE) products and the utility of the merged SPEs for hydrological modeling of ungauged areas. Three widely used near-real-time SPEs, including the Integrated Multi-satellite Retrievals for Global Precipitation Measurement (IMERG) early/late run (E/L) series, and the Precipitation Estimation from Remotely Sensed Information Using Artificial Neural Networks-Dynamic Infrared Rain Rate (PDIR) products, are used in the Beijiang basin in south China. The results show that the TC-based merged SPEs generally outperform all original SPEs, with higher consistency with the in-situ observations, and show superiority over the simple equal-weighted merged SPEs used for comparison; these findings indicate the superiority of the TC approach for utilizing the error characteristics of input SPEs for multi-SPE merging for ungauged areas. The validation of the hydrological modeling utility based on the Génie Rural à 4 paramètres Journalier (GR4J) model shows that the streamflow modeled by the TC-based merged SPEs has the best performance among all SPEs, especially for modeling low streamflow because the integration with the PDIR outperforms the IMERG products in low streamflow modeling. The TC merging approach performs satisfactorily for producing reliable near-real-time SPEs without gauge data, showing great potential for near-real-time applications, such as modeling rainstorms and monitoring floods and flash droughts in ungauged areas.
Article
Assessing the uncertainty of precipitation measurements is a challenging problem because precipitation estimates are inevitably influenced by various errors and environmental conditions. A way to characterize the error structure of coincident measurements is to use the triple colocation (TC) statistical method. Unlike more typical approaches, where measures are compared in pairs and one of the two is assumed error-free, TC has the enviable advantage of characterizing the uncertainties of co-located measurements being compared to each other, without requiring knowledge of the true value, which is often unknown. However, TC requires at least three co-located measuring systems and compliance with several initial assumptions. In this work, for the first time, TC is applied to in-situ measurements of rain precipitation acquired by three co-located devices: a weighing rain gauge, a laser disdrometer and a bidimensional video disdrometer. Both parametric and nonparametric formulations of TC are implemented to derive the rainfall product precision associated with the three devices. While the parametric TC technique requires tighter constraints and explicit assumptions which may be violated, causing some artifacts, the nonparametric formulation is more flexible and requires less strict constraints. For this reason, a comparison between the two TC formulations is also presented to investigate the impact of TC constraints and their possible violations. The results are obtained using a statistically robust dataset spanning a 1.5 year period collected in Switzerland and presented in terms of traditional metrics. According to triple colocation analysis, the two disdrometers outperform the classical weighing rain gauge and they have similar measurement error structure regardless of the integration time intervals.
Article
Precipitation is a vital pillar in most hydro-climatological studies. To measure its stochastic behavior, recent technological advancements have provided new sources of High-Resolution Precipitation Products (HRPPs), which could be utilized to overcome limitations of the ground measurements. However, the accuracy of HRPPs is not the same in different regions and climates and therefore should be assessed prior to any practical application. In this study, monthly datasets of ten HRPPs, known as CHIRPS, CMORPH, ERA5-Land, GPM_3IMERGM, MSWEP V2, PERSIANN, PERSIANN-CCS, PERSIANN-CDR, TerraClimate, and TRMM_3B43, are assessed over the Central Plateau watershed located in central Iran during 2005 to 2015. The lack of previous studies, as well as remarkable variations in altitude and climate of this watershed, make it a suitable region for studying the spatiotemporal pattern of precipitation and evaluating HRPPs. For this purpose, two approaches are implemented: comparing the products with ground measurements, and comparing them with one another using Triple Collocation (TC). In the first approach, an error decomposition scheme is utilized alongside other statistical metrics to further investigate the total and seasonal accuracy of the HRPPs; Köppen-Geiger climate classification indicators are used to assess the climate-based spatial performance of the HRPPs. According to the results, most of the HRPPs underestimate and overestimate precipitation values in wetter and drier climates, respectively. Additionally, winter contributes more than any other season to the biases of the products. While GPM_3IMERGM is the most accurate HRPP in the region with NRMSE, NSE, and KGE of 0.95, 0.62, and 0.73, respectively, PERSIANN-CCS results in the lowest accuracy with NRMSE, NSE, and KGE of 2.09, -0.82, and -0.02, respectively.
Article
Characterizing error structures in precipitation products not only facilitates their proper applications for scientific and practical purposes but also helps improve their retrieval algorithms and processing methods. Despite the fact that multiple precipitation products have been assessed in the literature, factors that affect their error structures remain inadequately addressed. By interpreting 60 binary decision trees, this study disentangles the error characteristics of precipitation products in terms of their spatiotemporal patterns and geographical factors. Three independent precipitation products - two satellite-based and one reanalysis datasets: the Integrated Multi-satellitE Retrievals for GPM (Global Precipitation Measurement) late run (IMERG-L), Soil Moisture to Rain-Advanced SCATterometer (SM2RAIN-ASCAT), and the Modern-Era Retrospective analysis for Research and Applications, Version 2 uncorrected precipitation output (MERRA2-UC), are evaluated across the contiguous United States from 2010 to 2019. The ground-based Stage IV precipitation dataset is used as the ground truth. Results indicate that the MERRA2-UC outperforms the IMERG-L and SM2RAIN-ASCAT with higher accuracy and more stable interannual patterns for the analysis period. Decision trees cross-assess three spatiotemporal factors and find that the underestimation of MERRA2-UC occurs in the east of the Rocky Mountains, and SM2RAIN-ASCAT underestimates precipitation over high latitudes, especially in winter. Additionally, the decision tree method ascribes system errors to nine different geographical characteristics, of which the distance to the coast, soil type, and DEM are the three dominant features. On the other hand, the land cover type, topography position index, and aspect are three relatively weak factors.
Technical Report
Triple collocation is a method that is now widely used to characterize systematic biases and random errors in in-situ measurements, satellite observations and model fields. It attempts to segregate the measurement uncertainties, geophysical, spatial and temporal representation and sampling differences in the different data sets by an objective method.
Article
Understanding the error structures of remotely sensed soil moisture products is essential for correctly interpreting observed variations and trends in the data or assimilating them in hydrological or numerical weather prediction models. Nevertheless, a spatially coherent assessment of the quality of the various globally available data sets is often hampered by the limited availability over space and time of reliable in-situ measurements. This study explores the triple collocation error estimation technique for assessing the relative quality of several globally available soil moisture products from active (ASCAT) and passive (AMSR-E and SSM/I) microwave sensors. The triple collocation technique is a powerful tool to estimate the root mean square error while simultaneously solving for systematic differences in the climatologies of a set of three independent data sources. In addition to the scatterometer and radiometer data sets, we used the ERA-Interim and GLDAS-NOAH reanalysis soil moisture data sets as a third, independent reference. The prime objective is to reveal trends in uncertainty related to different observation principles (passive versus active), the use of different frequencies (C-, X-, and Ku-band) for passive microwave observations, and the choice of the independent reference data set (ERA-Interim versus GLDAS-NOAH). The results suggest that the triple collocation method provides realistic error estimates. Observed spatial trends agree well with the existing theory and studies on the performance of different observation principles and frequencies with respect to land cover and vegetation density. In addition, if all theoretical prerequisites are fulfilled (e.g. a sufficiently large number of common observations is available and errors of the different data sets are uncorrelated) the errors estimated for the remote sensing products are hardly influenced by the choice of the third independent data set. 
The results obtained in this study can help us in developing adequate strategies for the combined use of various scatterometer and radiometer-based soil moisture data sets, e.g. for improved flood forecast modelling or the generation of superior multi-mission long-term soil moisture data sets.
Article
Drought in East Africa is a recurring phenomenon with significant humanitarian impacts. Given the steep climatic gradients, topographic contrasts, general data scarcity, and, in places, political instability that characterize the region, there is a need for spatially distributed, remotely derived monitoring systems to inform national and international drought response. At the same time, the very diversity and data scarcity that necessitate remote monitoring also make it difficult to evaluate the reliability of these systems. Here we apply a suite of remote monitoring techniques to characterize the temporal and spatial evolution of the 2010–2011 Horn of Africa drought. Diverse satellite observations allow for evaluation of meteorological, agricultural, and hydrological aspects of drought, each of which is of interest to different stakeholders. Focusing on soil moisture, we apply triple collocation analysis (TCA) to three independent methods for estimating soil moisture anomalies to characterize relative error between products and to provide a basis for objective data merging. The three soil moisture methods evaluated include microwave remote sensing using the Advanced Microwave Scanning Radiometer – Earth Observing System (AMSR-E) sensor, thermal remote sensing using the Atmosphere-Land Exchange Inverse (ALEXI) surface energy balance algorithm, and physically-based land surface modeling using the Noah land surface model. It was found that the three soil moisture monitoring methods yield similar drought anomaly estimates in areas characterized by extremely low or by moderate vegetation cover, particularly during the below-average 2011 long rainy season. Systematic discrepancies were found, however, in regions of moderately low vegetation cover and high vegetation cover, especially during the failed 2010 short rains. 
The merged, TCA-weighted soil moisture composite product takes advantage of the relative strengths of each method, as judged by the consistency of anomaly estimates across independent methods. This approach holds potential as a remote soil moisture-based drought monitoring system that is robust across the diverse climatic and ecological zones of East Africa.
Article
Triple collocation analysis (TCA) enables estimation of error variances for three or more products that retrieve or estimate the same geophysical variable using mutually independent methods. Several statistical assumptions regarding the statistical nature of errors (e.g., mutual independence and orthogonality with respect to the truth) are required for TCA estimates to be unbiased. Even though soil moisture studies commonly acknowledge that these assumptions are required for an unbiased TCA, no study has specifically investigated the degree to which errors in existing soil moisture datasets conform to these assumptions. Here these assumptions are evaluated both analytically and numerically over four extensively instrumented watershed sites using soil moisture products derived from active microwave remote sensing, passive microwave remote sensing, and a land surface model. Results demonstrate that nonorthogonal and error cross-covariance terms represent a significant fraction of the total variance of these products. However, the overall impact of error cross correlation on TCA is found to be significantly larger than the impact of nonorthogonal errors. Because of the impact of cross-correlated errors, TCA error estimates generally underestimate the true random error of soil moisture products.
Article
Using collocations of three different observation types of sea surface temperatures (SSTs) gives enough information to enable the standard deviation of error on each observation type to be derived. SSTs derived from the Advanced Along-Track Scanning Radiometer (AATSR) and Advanced Microwave Scanning Radiometer for Earth Observing System (EOS; AMSR-E) instruments are used, along with SST observations from buoys. Various assumptions are made within the error theory, including that the errors are not correlated, which should be the case for three independent data sources. An attempt is made to show that this assumption is valid and that the covariances between the different observations because of representativity error are negligible. Overall, the spatially averaged nighttime AATSR dual-view three-channel bulk SST observations for 2003 are shown to have a very small standard deviation of error of 0.16 K, whereas the buoy SSTs have an error of 0.23 K and the AMSR-E SST observations have an error of 0.42 K.
Article
Root Mean Square Errors (RMSEs) in the soil moisture anomaly time series obtained from the Advanced Scatterometer (ASCAT) and the Advanced Microwave Scanning Radiometer (AMSR-E; using the Land Parameter Retrieval Model) are estimated over a continental scale domain centered on North America, using two methods: triple colocation (RMSETC) and error propagation through the soil moisture retrieval models (RMSEEP). In the absence of an established consensus for the climatology of soil moisture over large domains, presenting a RMSE in soil moisture units requires that it be specified relative to a selected reference data set. To avoid the complications that arise from the use of a reference, the RMSE is presented as a fraction of the local time series standard deviation (fRMSE). For both sensors, the fRMSETC and fRMSEEP show similar spatial patterns of relatively high/low errors, and the mean fRMSE for each land cover class is consistent with expectations. Triple colocation is also shown to be surprisingly robust to representativity differences between the soil moisture data sets used, and it is believed to accurately estimate the fRMSE in the remotely sensed soil moisture anomaly time series. Comparing the ASCAT and AMSR-E fRMSETC shows that in general both data sets have good skill over low to moderate vegetation cover. Additionally, they have similar accuracy even when considered by land cover class, although the AMSR-E fRMSEs show a stronger signal of the vegetation cover.
Article
Using co-locations of three different observation types of sea surface temperatures (SSTs) gives enough information to enable the standard deviation of error on each observation type to be derived. SSTs derived from the Advanced Along-Track Scanning Radiometer (AATSR) and Advanced Microwave Scanning Radiometer (AMSR-E) instruments are used, along with SST observations from buoys. Various assumptions are made within the error theory including that the errors are not correlated, which should be the case for three independent data sources. An attempt is made to show that this assumption is valid and also that the covariances between the observations due to representativity error are negligible. Overall, the AATSR observations are shown to have a very small standard deviation of error of 0.16K, whilst the buoy SSTs have an error of 0.23K and the AMSR-E SST observations have an error of 0.42K.
Article
Triple collocation (TC) is routinely used to resolve approximated linear relationships between different measurements (or representations) of a geophysical variable that are subject to errors. It has been utilised in the context of calibration, validation, bias correction, and error characterisation to allow comparisons of diverse data records from various direct and indirect measurement techniques including in situ, remote sensing and model-based approaches. However, successful applications of TC require sufficiently large numbers of coincident data points from three independent time-series and, within the analysis period, homogeneity of their linear relationships and error structures. These conditions are difficult to realise in practice due to infrequent spatiotemporal sampling of satellite and ground-based sensors. TC can however be generalised within the framework of instrumental variable (IV) regression theory to address some of the conceptual constraints of TC. We review the theoretics of IV and consider one possible strategy to circumvent the three-data constraint by use of lagged variables (LV) as instruments. This particular implementation of IV is suitable for circumstances where multiple data records are limited and the geophysical variable of interest is sampled at time intervals shorter than its temporal correlation length. As a demonstration of utility, the LV method is applied to microwave satellite soil moisture data sets to recover their errors over Australia, and to estimate temporal properties of their relationships with in situ and model data. These results are compared against standard two-data linear estimators and the TC estimator as benchmark.