Technical Report

Minimum-Variance Coefficients for the Generalized Multivariate Difference Estimator (GMDe)

Abstract

The Generalized Multivariate Difference estimator (GMDe) is a broad generalization of the univariate "difference estimator" described by Hansen et al. (1953:250-253) and Särndal et al. (1992:239-242). Difference estimators use population estimates of auxiliary variables to improve population estimates of correlated study variables. Examples of auxiliary variables include administrative records, remotely sensed measurements, and time series of predictions from deterministic process models (e.g., econometric models, demographic models, forest-stand projection models). GMDe is a multivariate alternative to model-assisted generalized regression (GREG) estimators for finite populations, such as post-stratification, ratio, regression, lasso, ridge, and elastic net estimators. GMDe does not require model-assisted predictions of a proxy variable, nor does it require known population totals for all auxiliary variables. Much like the composite estimator, GMDe is a simple linear transformation of a vector of population estimates obtained from a probability sample with the design-consistent multivariate Horvitz-Thompson "π-estimator". GMDe therefore does not directly use the data matrix of study variables and auxiliary variables for each population element included in the probability sample. This Technical Report derives the M×J matrix of minimum-variance coefficients, one for each pairing of the M study variables with the J auxiliary variables, for the linear transformation in GMDe. The degree of variance reduction with GMDe depends, in part, upon the strength of the correlations between study variables and auxiliary variables. Substantial improvements of GMDe over the prior π-estimate require relatively strong correlations (e.g., ±0.70 or stronger). In a National Forest Inventory (NFI), remotely sensed auxiliary variables are sufficiently correlated with broad groupings of domains.
However, predictions from deterministic process models might provide auxiliary variables that are more strongly correlated with detailed study variables that change slowly or predictably over time. In addition, change detection with remotely sensed data can post-stratify the population into undisturbed strata for which deterministic process models provide stronger predictors. This Technical Report includes a simple example of the recursive version of GMDe, a relatively simple estimator for complex sample surveys, including longitudinal surveys for time series of population estimates, interpenetrating panels, multi-phase and multi-stage sampling, and multiple independent surveys. If a design-based π-estimate is feasible for a vector of study variables and correlated auxiliary variables, then GMDe can use those population π-estimates to reduce the variances of the study variables. The recursive GMDe can also impose equality and inequality constraints on study variables and mitigate the influence of outliers. The recursive GMDe replaces inversion of the J×J partition of the π-covariance matrix for auxiliary residuals with a sequence of J scalar divisions.
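The linear transformation and its recursive form can be illustrated with a minimal NumPy sketch. It is not the report's own algorithm or notation. It assumes a stacked vector `z_hat` whose first M entries are π-estimates of the study variables and whose last J entries are auxiliary residuals with expectation zero, together with an estimated covariance matrix `V` for that vector. Under those assumptions, the variance-minimizing coefficient matrix takes the standard form K = V_yd V_dd⁻¹. The function names `gmde_batch` and `gmde_recursive` and all variable names are hypothetical.

```python
import numpy as np


def gmde_batch(z_hat, V, M):
    """Batch sketch: adjust M study estimates with J auxiliary residuals.

    z_hat : length M+J vector (study estimates first, then residuals
            that are assumed to have expectation zero).
    V     : (M+J)x(M+J) estimated covariance matrix of z_hat.
    """
    y, d = z_hat[:M], z_hat[M:]
    V_yy, V_yd, V_dd = V[:M, :M], V[:M, M:], V[M:, M:]
    # K = V_yd V_dd^{-1} (M x J); solve() avoids forming the inverse.
    # The transpose trick relies on V_dd being symmetric.
    K = np.linalg.solve(V_dd, V_yd.T).T
    y_new = y - K @ d              # adjusted study estimates
    V_new = V_yy - K @ V_yd.T      # covariance after adjustment
    return y_new, V_new, K


def gmde_recursive(z_hat, V, M):
    """Recursive sketch: absorb one residual at a time.

    Each pass divides by the single scalar V[i, i], so no J x J
    inverse is formed. Assumes V[i, i] > 0 at every step, which
    fails if the residuals are exactly collinear.
    """
    z = np.array(z_hat, dtype=float)
    V = np.array(V, dtype=float)
    for i in range(M, z.size):
        gain = V[:, i] / V[i, i]            # one scalar division per residual
        z = z - gain * z[i]                 # drives residual i to zero
        V = V - np.outer(gain, V[i, :])     # rank-one covariance update
    return z[:M], V[:M, :M]


# Synthetic illustration only: random inputs, not survey data.
rng = np.random.default_rng(0)
M, J = 2, 3
A = rng.normal(size=(M + J, M + J))
V = A @ A.T + np.eye(M + J)                 # symmetric positive definite
z_hat = rng.normal(size=M + J)

y_b, V_b, K = gmde_batch(z_hat, V, M)
y_r, V_r = gmde_recursive(z_hat, V, M)
print(np.allclose(y_b, y_r), np.allclose(V_b, V_r))
```

In the scalar case (M = J = 1), this form gives an adjusted variance of (1 − ρ²) times the original, where ρ is the correlation between the study estimate and the auxiliary residual. A correlation of ±0.70 therefore removes roughly half of the variance, which is in line with the abstract's remark that substantial gains require relatively strong correlations.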