Proceedings of Statistics Canada Symposium 2001
Achieving Data Quality in a Statistical Agency: A Methodological Perspective
EVALUATION OF SMALL AREA ESTIMATION METHODS – AN
APPLICATION TO UNEMPLOYMENT ESTIMATES FROM THE
UK LFS
Gary Brown¹, Ray Chambers², Patrick Heady¹, Dick Heasman¹
ABSTRACT
This paper describes joint research by the ONS and Southampton University on the evaluation of several
different approaches to the local estimation of ILO unemployment. The need to compare estimators with
different underlying assumptions has led to a focus on evaluation methods that are (partly at least) model-
independent. Model fit diagnostics that have been considered include various residual procedures, cross-
validation, predictive validation, consistency with marginals, and consistency with direct estimates within single
cells. These have been used to compare different model-based estimators with each other and with direct
estimators.
KEY WORDS: Diagnostics; Estimates; Bias; Standard errors; Confidence intervals.
1. INTRODUCTION
A small area estimation methodology can be thought of as a model plus fitting method for the small area
values of interest coupled with an estimation method based on the fitted model. Basic properties that we
require of such a methodology are:
1. The expected values defined by the model underlying the small area values should be “good”. That is,
they should explain a significant proportion of the variation in the small area values of interest. Note
that for models that include random effects, these do not contribute to the expected value.
2. The values for the model-based estimates derived from the fitted model should be consistent with the
unbiased direct survey estimates, where these are available. That is, they should provide an
approximation to the direct estimates that is consistent with these values being "close" to the expected
values of the direct estimates.
3. The model-based small area estimates should have mean squared errors significantly lower than the
variances of corresponding direct estimates.
4. The changes over time in the model-based estimates for a particular small area should be more stable
than the corresponding changes in the direct estimates over the same time.
5. The model-based estimates for a particular small area should be acceptable to informed users from that
small area.
This paper does not attempt to cover all of the above agenda. Clearly, standard model-fitting diagnostics
can be used to assess property 1 – and we have also restricted the discussion to indicators that relate to a
single point in time, thus excluding point 4. The most important omission, however, is point 5. Despite the
fact that user-consultation is not discussed here, ONS takes the process of consultation very seriously
indeed – both as a way of ensuring public acceptance, and as a valuable input to improving the estimates
themselves.
¹ Office for National Statistics, 1 Drummond Gate, London, SW1V 2QQ, U.K.
² Department of Social Statistics, University of Southampton, SO17 1BJ, U.K.
This paper is about the preliminary internal evaluation work needed to select a suitable small area estimator
in situations where there are a number of competing small area models that are not necessarily nested and
there is some doubt about the assumptions underpinning all of these models. In particular, we discuss four
diagnostics that we have found useful in this regard. These assess the bias and goodness of fit of the
estimation method, the coverage of the confidence intervals generated by the method and the calibration
error of the method. All are based on the crucial assumption that the direct estimates of the small area
values of interest are unbiased (but highly variable) and the confidence intervals associated with these
estimates achieve their nominal coverage levels.
These diagnostics have been developed in the process of investigating small area estimators for both
unemployment and a range of other socio-economic variables. The theory behind these estimators is
described in Ambler et al (2001) and ONS (2001). In the following section we describe the diagnostics in
more detail, applying them to a small area unemployment estimator from Ambler et al (2001). In practice
several diagnostics are used at each stage of the model selection process.
2. DIAGNOSTICS
2.1 A bias diagnostic
-{The direct estimates are unbiased estimates of the “truth” – if the truth were known and plotted on
the X axis of a graph, with direct estimates as Y, the regression line would fall on the 45° line. We plot the
model estimates as X, in place of the “truth”, and see how close the regression line is to Y=X. This
provides a visual illustration of bias, and by comparing the regression line with Y=X, a parametric
significance test for the bias of the model estimates.³}
The diagnostic is based on the following idea. If the model-based estimates are "close" to the small area
values of interest, then unbiased direct estimators should behave like random variables whose expected
values correspond to the values of the model-based estimates. That is, the model-based estimates should be
unbiased predictors of the direct estimates. As a check for such predictive (i.e. conditional) bias in the
model-based estimates, we plot appropriately scaled values of these estimates (X-axis) against similarly
scaled direct estimates (Y-axis) and then test whether the OLS (ordinary least squares) regression line fitted
to these points is significantly different from the identity line.³
When there is significant variation in small area sizes this test typically requires an initial transformation of
both the direct and model-based estimates so that the homoskedasticity assumption underpinning the OLS
fitting method is satisfied. Such a transformation can be identified using standard methods. In our
unemployment example below, a square root transformation was used, since the estimates relate to counts
of unemployed people in the small areas of interest.
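As an illustration of the procedure just described, the following is a minimal sketch in Python, not the code used for the paper. The function name `bias_regression` and the six pairs of estimates are hypothetical; only the square root transformation and the comparison of the OLS line with Y=X come from the text.

```python
import math

def bias_regression(direct, model):
    """OLS of sqrt(direct) on sqrt(model).

    Returns (intercept, slope, se_intercept, se_slope).
    """
    x = [math.sqrt(m) for m in model]    # X axis: model-based estimates
    y = [math.sqrt(d) for d in direct]   # Y axis: direct estimates
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    s2 = rss / (n - 2)                   # residual variance estimate
    se_slope = math.sqrt(s2 / sxx)
    se_intercept = math.sqrt(s2 * (1.0 / n + x_bar ** 2 / sxx))
    return intercept, slope, se_intercept, se_slope

# Hypothetical counts for six small areas (illustrative only, not LFS data).
model = [100, 400, 900, 1600, 2500, 3600]
direct = [121, 361, 961, 1521, 2601, 3481]

intercept, slope, se_intercept, se_slope = bias_regression(direct, model)
t_intercept = intercept / se_intercept   # compares the intercept with 0
t_slope = (slope - 1.0) / se_slope       # compares the slope with 1
print(f"intercept {intercept:.3f} ({se_intercept:.3f}), slope {slope:.3f} ({se_slope:.3f})")
```

Small absolute values of both ratios are consistent with the fitted line lying close to Y=X. A joint test of intercept 0 and slope 1 would be a natural refinement; the sketch keeps the two comparisons separate for simplicity.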
The use of this diagnostic is straightforward when the focus of interest is on small area totals since
unbiased direct estimators of such totals are typically available. The use of transformations to stabilise the
residual variance in the plot will of course introduce a slight bias, but we feel that this is acceptable.
However the issue becomes more complex when the focus of interest is on small area proportions, because
the denominator of the direct estimator of such a proportion is typically a random variable and so the
proportion is in effect a ratio estimator and hence biased. We have adopted two different strategies in this
case:
(I) Concentrate on the numerator of the estimated proportion - the estimated small area total -
since this can be estimated without bias.
(II) Compare the direct and model-based estimators of the proportion and accept that the resulting
ratio bias may slightly distort the interpretation of the diagnostic.
³ The calculated significance values do not allow for the fact that the X values are derived from the same
data as the Y values. This will often make the true rejection probability of the test lower than its face value
– in which case an apparent rejection would be more significant than it seemed.
The relative attractiveness of these two options depends in part on characteristics of the sample and of the
population of the small areas of interest. In the case of strategy (I) there is a danger that, if these population
sizes vary a great deal, the pattern shown in the scatterplot will owe more to this variability than to the
biasedness or otherwise of the model-based estimators of the small area proportions. In the case of strategy
(II) the lower the coefficient of variation of the denominator of the small area proportion, the lower the risk
of serious bias in the direct estimate of the proportion, and hence the more applicable the diagnostic.
Finally, if the model underlying the small area estimates is actually fitted using proportions, strategy (II)
can also be interpreted in the following way. It provides a way of looking for bias due to model
misspecification - but cannot be expected to discover any bias in the direct estimates which were used in
fitting the model.
Example
In this example the variable of interest is the number of individuals who are unemployed according to the
International Labour Organisation definition (“ILO unemployed”) in 406 LAD/UAs (Local Authority
Districts and Unitary Authorities) in Great Britain. Direct estimates for this variable are available annually
from the Local Area Data Base of the UK Labour Force Survey (LFS). A number of logistic models for the
probability of being unemployed in a LAD/UA were fitted to these data and used to define model-based
estimates for these small areas. See Ambler et al (2001) for further details. Here we focus on the modified
Fay-Herriot approach concentrated upon in that paper. The model includes covariates defined by the
claimant count in six age by sex cells within each LAD/UA (the claimant count is the number of people
who claim unemployment benefits), as well as age by sex effects, LAD/UA regional effects and LAD/UA
socio-economic classification effects. In addition, the model includes an extra area level effect, defined by
the logit of the total LAD/UA claimant count (as a proportion of the LAD/UA population), which is used as a
measure of the overall economic activity within the LAD/UA, and consequently reflects an individual's
opportunity to obtain employment in that area.
The OLS regression parameters, with standard errors in brackets, from the bias scatterplot for the five years
1995/1996 to 1999/2000 are given in Table 1. None of the regression lines show a significant difference
from Y=X. A visual illustration is given in Figure 1 for 1999/2000, where the Y=X and regression lines
show very little disparity.
             Intercept          Slope
1995/1996     0.463 (1.062)     0.989 (0.014)
1996/1997    -1.354 (1.060)     1.010 (0.014)
1997/1998    -0.832 (1.108)     1.003 (0.016)
1998/1999    -0.615 (1.104)     0.999 (0.017)
1999/2000    -1.927 (1.135)     1.017 (0.018)
Table 1 OLS regression parameters from bias scatterplots
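The Table 1 figures can be turned into simple ratios, (intercept - 0)/s.e. and (slope - 1)/s.e., as a rough check on the statement that no line differs significantly from Y=X. The sketch below is only that: the paper does not state which test was used, and the threshold of 1.96 (a two-sided 5% normal cut-off applied to each parameter separately) is an assumption.

```python
# Table 1 values: year -> (intercept, se_intercept, slope, se_slope)
table1 = {
    "1995/1996": (0.463, 1.062, 0.989, 0.014),
    "1996/1997": (-1.354, 1.060, 1.010, 0.014),
    "1997/1998": (-0.832, 1.108, 1.003, 0.016),
    "1998/1999": (-0.615, 1.104, 0.999, 0.017),
    "1999/2000": (-1.927, 1.135, 1.017, 0.018),
}

# Ratio of each parameter's distance from its Y=X value to its standard error.
t_ratios = {
    year: (b0 / se0, (b1 - 1.0) / se1)
    for year, (b0, se0, b1, se1) in table1.items()
}

for year, (t0, t1) in t_ratios.items():
    print(f"{year}: intercept ratio {t0:.2f}, slope ratio {t1:.2f}")

# True when every ratio is inside the assumed +/-1.96 band.
all_within_band = all(abs(t0) < 1.96 and abs(t1) < 1.96 for t0, t1 in t_ratios.values())
```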
[Figure 1: scatterplot of sqrt(direct estimate) (Y axis, 0 to 250) against sqrt(model-based estimate) (X axis, 0 to 250)]
Figure 1 Bias scatterplot for 1999/2000 with Y=X and regression lines fitted
The interpretation changes when scatterplots and regression lines are fitted for the estimated proportions of
individuals who are ILO unemployed for LAD/UAs. Figure 2 shows the scatterplot for 1999/2000, with a
regression line Y = –0.0145 (0.009) + 1.059 (0.048) X. This shows more disparity from Y=X and hence more
possible bias, although the evidence is still not statistically significant.
[Figure 2: scatterplot of sqrt(direct estimate) (Y axis, 0.0 to 0.4) against sqrt(model-based estimate) (X axis, 0.0 to 0.4)]
Figure 2 Bias scatterplot for proportions for 1999/2000 with Y=X and regression lines fitted
2.2 A goodness of fit diagnostic
-{We want our model estimates to be close to the direct estimates when the direct estimates are good.
We inversely weight their squared difference by their variance and sum over all areas – this sum gives
more weight to differences from good direct estimates than from bad. We test this sum against the
$\chi^2$ distribution to provide a parametric significance test of bias of model estimates relative to their
precision.}
As a check for unconditional bias in the model-based estimates we use a Wald goodness of fit statistic to
test whether there is a significant difference between the expected values of the direct estimates and the
model-based estimates.
In order to describe this test, we assume that the variable of interest is unemployment status and the
available data consist of direct estimates (from the LFS) of the population proportion unemployed by age
and sex in each small area, together with model-based estimates of the population proportion unemployed
in each small area. Let i denote age-sex class and j denote small area. We assume that age-sex classes are
finely enough defined so that within an age-sex class in a small area there is little or no variation in the
sample weights. This allows us to define an "average" weight for all individuals in age-sex class i in small
area j,
(1)  $w_{ij} = \dfrac{\sum_{k \in s(ij)} w_k}{n_{ij}}$

and to therefore approximate the direct estimate $\hat{z}_{ij}$ of the population proportion unemployed in age-sex
class i in small area j by

(2)  $\hat{z}_{ij} = \dfrac{w_{ij} \sum_{k \in s(ij)} U_k}{w_{ij} n_{ij}} = \dfrac{\sum_{k \in s(ij)} U_k}{n_{ij}} = \hat{p}_{ij}$

where s(ij) denotes the individuals in age-sex class i who are in the survey sample in small area j, $U_k$ takes
the value 1 if individual k is unemployed and zero otherwise, and $w_k$ is the survey sample weight attached
to individual k. Note that $\hat{p}_{ij}$ is just the sample proportion of interest in age-sex class i and small area j. The
corresponding approximation to the direct estimate of the proportion of unemployed in the small area is
then

(3)  $\hat{z}_j = \dfrac{\sum_i w_{ij} n_{ij} \hat{p}_{ij}}{\sum_i w_{ij} n_{ij}} = \dfrac{\sum_i w_{ij} u_{ij}}{\sum_i w_{ij} n_{ij}}$

where $u_{ij}$ denotes the total number of unemployed in sample in age-sex class i in small area j. Since the
sample design of the LFS is essentially simple random sampling within a small area, we can model the
sample counts $\{u_{ij}\}$ and $\{n_{ij}\}$ as realisations of correlated multinomial random variables. In particular, let
$\pi_{ij}$ be the probability that a randomly chosen individual in age-sex class i and small area j has the
characteristic of interest and $\phi_{ij}$ be the probability that a randomly chosen individual in small area j is in
age-sex class i. Then

(4)  $E(u_{ij}) = n_j \phi_{ij} \pi_{ij}$,  $\operatorname{var}(u_{ij}) = n_j \phi_{ij} \pi_{ij} (1 - \phi_{ij} \pi_{ij})$,  $\operatorname{cov}(u_{ij}, u_{i'j}) = -n_j \phi_{ij} \pi_{ij} \phi_{i'j} \pi_{i'j}$

(5)  $E(n_{ij}) = n_j \phi_{ij}$,  $\operatorname{var}(n_{ij}) = n_j \phi_{ij} (1 - \phi_{ij})$,  $\operatorname{cov}(n_{ij}, n_{i'j}) = -n_j \phi_{ij} \phi_{i'j}$

and

(6)  $\operatorname{cov}(u_{ij}, n_{ij}) = n_j \pi_{ij} \phi_{ij} (1 - \phi_{ij})$,  $\operatorname{cov}(u_{ij}, n_{i'j}) = -n_j \pi_{ij} \phi_{ij} \phi_{i'j}$.
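First order approximations to the expected value and variance of $\hat{z}_j$ are

(7)  $E(\hat{z}_j) \approx \dfrac{\sum_i w_{ij} \phi_{ij} \pi_{ij}}{\sum_i w_{ij} \phi_{ij}} = \zeta_j$,  $\operatorname{var}(\hat{z}_j) \approx \dfrac{\mathbf{w}_j^T \operatorname{var}(\mathbf{u}_j - \zeta_j \mathbf{n}_j)\, \mathbf{w}_j}{\left(\mathbf{w}_j^T E(\mathbf{n}_j)\right)^2}$

where $\mathbf{u}_j$, $\mathbf{n}_j$ and $\mathbf{w}_j$ are the vectors with components $\{u_{ij}\}$, $\{n_{ij}\}$ and $\{w_{ij}\}$ respectively. Furthermore, the
components of $\operatorname{var}(\mathbf{u}_j - \zeta_j \mathbf{n}_j)$ are given by

(8)  $\operatorname{var}(u_{ij} - \zeta_j n_{ij}) = n_j \phi_{ij} (1 - \phi_{ij}) \left[ \dfrac{\pi_{ij} (1 - \phi_{ij} \pi_{ij})}{1 - \phi_{ij}} - 2 \pi_{ij} \zeta_j + \zeta_j^2 \right]$

and

(9)  $\operatorname{cov}(u_{ij} - \zeta_j n_{ij},\, u_{i'j} - \zeta_j n_{i'j}) = -n_j \phi_{ij} \phi_{i'j} (\pi_{ij} - \zeta_j)(\pi_{i'j} - \zeta_j)$.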
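The variance in (8) can be checked from the moments in (4) to (6). The following is a sketch of that algebra, assuming that $n_j$ denotes the sample size in small area j:

```latex
\begin{align*}
\operatorname{var}(u_{ij} - \zeta_j n_{ij})
  &= \operatorname{var}(u_{ij}) - 2\zeta_j \operatorname{cov}(u_{ij}, n_{ij})
     + \zeta_j^2 \operatorname{var}(n_{ij}) \\
  &= n_j \phi_{ij}\pi_{ij}(1 - \phi_{ij}\pi_{ij})
     - 2\zeta_j\, n_j \pi_{ij}\phi_{ij}(1 - \phi_{ij})
     + \zeta_j^2\, n_j \phi_{ij}(1 - \phi_{ij}) \\
  &= n_j \phi_{ij}(1 - \phi_{ij})
     \left[ \frac{\pi_{ij}(1 - \phi_{ij}\pi_{ij})}{1 - \phi_{ij}}
            - 2\pi_{ij}\zeta_j + \zeta_j^2 \right].
\end{align*}
```

The covariance in (9) follows in the same way by expanding the four cross terms and factorising.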
Typically, small area-level model-based and direct survey estimates will be approximately uncorrelated.
Consequently, a Wald statistic for testing the small area-level goodness-of-fit of a model-based set of
estimates of interest is
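(10)  $W = \sum_j \dfrac{(\hat{z}_j - \hat{\zeta}_j)^2}{\hat{V}(\hat{z}_j) + \hat{V}(\hat{\zeta}_j)}$

where $\hat{\zeta}_j$ is the model-based estimate of the proportion of small area j population that are unemployed,
$\hat{V}(\hat{\zeta}_j)$ is its estimated variance and

(11)  $\hat{V}(\hat{z}_j) = \dfrac{\mathbf{w}_j^T \hat{V}(\mathbf{u}_j - \hat{\zeta}_j \mathbf{n}_j)\, \mathbf{w}_j}{\left(\mathbf{w}_j^T \hat{E}(\mathbf{n}_j)\right)^2} = \dfrac{\mathbf{w}_j^T \hat{V}(\mathbf{u}_j - \hat{\zeta}_j \mathbf{n}_j)\, \mathbf{w}_j}{\hat{N}_j^2}$

where $\hat{N}_j$ is the survey estimate of the population of the small area and $\hat{V}(\mathbf{u}_j - \hat{\zeta}_j \mathbf{n}_j)$ is a matrix with
diagonal components

(12)  $n_{ij} \left(1 - \dfrac{n_{ij}}{n_j}\right) \left[ \dfrac{\hat{\pi}_{ij} (n_j - n_{ij} \hat{\pi}_{ij})}{n_j - n_{ij}} - 2 \hat{\pi}_{ij} \hat{\zeta}_j + \hat{\zeta}_j^2 \right]$

and off-diagonal components

(13)  $-\dfrac{n_{ij} n_{i'j}}{n_j} (\hat{\pi}_{ij} - \hat{\zeta}_j)(\hat{\pi}_{i'j} - \hat{\zeta}_j)$.

Here $\hat{\pi}_{ij}$ is the model-based estimate of the proportion unemployed in age-sex class i in small area j.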
Under the hypothesis that the model-based estimates are equal to the expected values of the direct
estimates, and provided the sample sizes in the small areas are sufficient to justify central limit
assumptions, W will then have a $\chi^2$ distribution with degrees of freedom equal to the number of small
areas in the population.
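The calculation of W and its p-value can be sketched as follows. This is an illustration under stated assumptions rather than the code used for the paper: the function names and the three-area data are hypothetical, the p-value is taken as the upper tail probability, and the chi-square tail is approximated with the Wilson-Hilferty normal approximation so that only the Python standard library is needed.

```python
import math
from statistics import NormalDist

def wald_statistic(z_hat, zeta_hat, var_z, var_zeta):
    """Sum over areas of the squared difference divided by the summed variances."""
    return sum(
        (z - zeta) ** 2 / (vz + vzeta)
        for z, zeta, vz, vzeta in zip(z_hat, zeta_hat, var_z, var_zeta)
    )

def chi2_upper_tail(x, df):
    """Approximate P(chi-square with df degrees of freedom > x) (Wilson-Hilferty)."""
    c = 2.0 / (9.0 * df)
    z = ((x / df) ** (1.0 / 3.0) - (1.0 - c)) / math.sqrt(c)
    return 1.0 - NormalDist().cdf(z)

# Hypothetical direct and model-based proportions for three small areas.
z_hat = [0.06, 0.05, 0.08]
zeta_hat = [0.05, 0.05, 0.07]
var_z = [0.0001, 0.0002, 0.0001]
var_zeta = [0.0001, 0.0002, 0.0001]
w_example = wald_statistic(z_hat, zeta_hat, var_z, var_zeta)

# Approximate tail probability for a statistic of 349.84 on 406 degrees of freedom.
p_example = chi2_upper_tail(349.84, 406)
print(f"W = {w_example:.3f}, approximate p-value = {p_example:.2f}")
```

With several hundred degrees of freedom the approximation is expected to be close to the exact tail probability; an exact chi-square routine from a statistics library could be substituted.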
Example
We continue with the model-based approach introduced earlier for the same five years. The goodness of fit
statistics are in Table 2. None of the statistics show evidence to reject a $\chi^2$ distribution; in fact the fit
seems almost too good. This may be an artefact of including estimated between LAD/UA variance in the
mean squared error of the model-based estimates, which errs on the side of caution.
             1995/1996    1996/1997    1997/1998    1998/1999    1999/2000
Statistic    349.84       358.80       376.21       349.59       377.85
[p-value]    [0.98]       [0.96]       [0.85]       [0.98]       [0.84]
Table 2 Goodness of fit statistic values with [p-values]
2.3 A coverage diagnostic
-{95% Confidence intervals for the direct estimates should contain the “truth” 95% of the time. So
should the confidence intervals surrounding model-based estimates. We adjust both sets of intervals,
so that their chance of overlapping should be 95% and count how often they actually do overlap.
Assuming that the estimated coverage of the direct confidence intervals is correct, comparing the
counts to the Binomial distribution provides a non-parametric significance test of the bias of model
estimates relative to their precision.}
This diagnostic evaluates the validity of the confidence intervals generated by the model-based small area
estimation procedure. It assumes that valid 95 percent confidence intervals for the small area values of
interest can be generated from the direct estimates. The basic idea then is to measure the overlap between
these direct confidence intervals and corresponding 95 percent confidence intervals generated by the
model-based estimation procedure. However, since the degree of overlap between two independent 95
percent confidence intervals for the same quantity will be higher than 95 percent, it is necessary to first
modify the nominal coverage levels of the confidence intervals being compared in order to ensure a
nominal 95 percent overlap.
This modification is based on the fact that if X and Y are two independent normal random variables, with
the same mean but with different standard deviations, $\sigma_X$ and $\sigma_Y$ respectively, and if $z(\alpha)$ is such that
the probability that a standard normal variable takes values greater than $z(\alpha)$ is $\alpha/2$, then a sufficient
condition for there to be probability of $\alpha$ that the two intervals $X \pm z(\beta)\sigma_X$ and $Y \pm z(\beta)\sigma_Y$ do not
overlap is when

(14)  $z(\beta) = z(\alpha) \left(1 + \dfrac{\sigma_X}{\sigma_Y}\right)^{-1} \sqrt{1 + \dfrac{\sigma_X^2}{\sigma_Y^2}}$.
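The reasoning behind (14) can be sketched as follows, under the assumptions stated above (X and Y independent and normal with a common mean):

```latex
\begin{align*}
\text{the intervals do not overlap}
  &\iff |X - Y| > z(\beta)\,(\sigma_X + \sigma_Y), \\
X - Y &\sim N\!\left(0,\ \sigma_X^2 + \sigma_Y^2\right), \\
P(\text{no overlap}) = \alpha
  &\iff z(\beta)\,(\sigma_X + \sigma_Y) = z(\alpha)\sqrt{\sigma_X^2 + \sigma_Y^2}, \\
z(\beta) &= z(\alpha)\,\frac{\sqrt{\sigma_X^2 + \sigma_Y^2}}{\sigma_X + \sigma_Y}
          = z(\alpha)\left(1 + \frac{\sigma_X}{\sigma_Y}\right)^{-1}
            \sqrt{1 + \frac{\sigma_X^2}{\sigma_Y^2}}.
\end{align*}
```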
Consequently, this diagnostic takes $z(\alpha)$ = 1.96, calculates $z(\beta)$ using the above formula, with $\sigma_X$
replaced by the estimated standard error of the model-based estimate and $\sigma_Y$ replaced by the estimated
standard error of the direct estimate and then computes the overlap proportion between the corresponding
$z(\beta)$-based confidence intervals generated by the two estimation methodologies. Nominally, for $z(\alpha)$ =
1.96, this overlap proportion should be 95 percent. Note that $z(\beta) = z(\alpha)$ when $\sigma_X = 0$.
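A minimal Python sketch of this calculation is given below. The function names and the three pairs of estimates are hypothetical, and areas with a zero direct standard error are assumed to have been filtered out beforehand (otherwise the ratio is undefined).

```python
import math

def adjusted_z(se_model, se_direct, z_alpha=1.96):
    """z(beta) from formula (14), with sigma_X = se_model and sigma_Y = se_direct."""
    ratio = se_model / se_direct
    return z_alpha * math.sqrt(1.0 + ratio ** 2) / (1.0 + ratio)

def intervals_overlap(model_est, se_model, direct_est, se_direct, z_alpha=1.96):
    """True if the two z(beta)-based intervals overlap."""
    z_beta = adjusted_z(se_model, se_direct, z_alpha)
    return abs(model_est - direct_est) <= z_beta * (se_model + se_direct)

def non_coverage_count(model_ests, se_models, direct_ests, se_directs):
    """Number of areas whose adjusted intervals fail to overlap."""
    return sum(
        not intervals_overlap(m, sm, d, sd)
        for m, sm, d, sd in zip(model_ests, se_models, direct_ests, se_directs)
    )

# Hypothetical estimates and standard errors for three small areas.
model_ests = [10.0, 20.0, 30.0]
se_models = [1.0, 1.0, 1.0]
direct_ests = [12.7, 20.0, 35.0]
se_directs = [1.0, 2.0, 1.0]

misses = non_coverage_count(model_ests, se_models, direct_ests, se_directs)
print(f"{misses} of {len(model_ests)} areas show non-overlap")
```

The overlap condition used here is algebraically the same as requiring the difference between the two estimates to be within $z(\alpha)$ standard errors of that difference, which is why the nominal overlap rate is 95 percent.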
This diagnostic can also be used to assess the need to include a small area random effect in the model, by
just looking at the proportion of direct estimate-based confidence intervals that cover the model-based
estimates of the expected values of the small area quantities of interest. Ideally, if the model-based
estimator is essentially the small area quantity of interest, then around 5% of the small areas will record
such noncoverage. However, if small area level random effects are present (i.e. a multilevel model is more
appropriate than a single level model) then more than 5% of small areas will necessarily show
noncoverage. Used in this way, this diagnostic can be interpreted in two ways, as a test for bias in a single
level model, or as a test for whether a multilevel model is needed – the interpretation depending on whether
a single level model is known to be sufficient or not.
Example
We continue with the model-based approach introduced earlier for the same five years. Non-coverage totals
and percentages are shown in Table 3 (we filter out zero direct estimates of unemployment). For 1997/1998
and 1999/2000 there is significant evidence to reject 5% non-coverage. However, this means we have
overcoverage, and the mean squared error of the model-based estimates is too large. As this is erring
towards giving conservative confidence intervals it is not a major cause for concern.
                1995/1996        1996/1997        1997/1998        1998/1999        1999/2000
Non-coverage    11 out of 406    13 out of 406    17 out of 406    11 out of 406    13 out of 406
(percentage)    (2.7%)           (3.2%)           (4.2%)           (2.7%)           (3.2%)
[p-value]       [0.03]           [0.11]           [0.54]           [0.03]           [0.11]
Table 3 Non-coverage totals with (percentages) and [p-values]
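The comparison of a non-coverage count with the Binomial distribution can be sketched as below. The paper does not say how the two-sided p-value was formed; doubling the smaller tail probability is an assumption made here, and the function names are hypothetical. Under that assumption the results come out close to the bracketed values in Table 3.

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), by direct summation."""
    return sum(math.comb(n, i) * p ** i * (1.0 - p) ** (n - i) for i in range(k + 1))

def two_sided_p(k, n, p=0.05):
    """Two-sided p-value obtained by doubling the smaller tail (assumed convention)."""
    lower = binom_cdf(k, n, p)                            # P(X <= k)
    upper = 1.0 - binom_cdf(k - 1, n, p) if k > 0 else 1.0  # P(X >= k)
    return min(1.0, 2.0 * min(lower, upper))

# Non-coverage counts reported in Table 3, each out of 406 areas.
p_values = {count: two_sided_p(count, 406) for count in (11, 13, 17)}
for count, p_value in p_values.items():
    print(f"{count} out of 406: p-value {p_value:.2f}")
```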
2.4 A calibration diagnostic
-{Calculating how much modelled estimates differ from direct estimates when aggregated to larger
domains shows us whether any particular larger domain is estimated worse than any other. For
example this may show how a model may be poorly estimating large urban areas, whilst estimating
large rural areas well. This provides some evidence regarding spatial bias/autocorrelation of model
estimates. However, the value of the evidence depends on the size of domains in question.}
The final diagnostic we consider is the amount of scaling required to calibrate a set of model-based small
area estimates. This measure is based on what is typically a key requirement for small area estimates - that
they sum to direct estimates at appropriate levels of aggregation. We refer to this property as calibration.
The basis for this requirement is simple. Large sample sizes at higher levels of aggregation mean that the
direct estimates can be considered to be accurate at these levels. Consequently, given two sets of model-based
estimates, one that agrees with the direct estimates under appropriate aggregation and one that does not, we
prefer the former. In practice, since model-based small area estimates are calibrated, usually by appropriate
scaling, checking calibration after such scaling is irrelevant. However, by calculating the relative difference
between the aggregated model-based estimates prior to this calibration and the aggregated direct estimates
we obtain a measure of how accurate the aggregated model-based estimates are, which provides a means to
compare different models.
An interesting issue to consider when using this diagnostic is deciding the calibration level. Since the
aggregated direct estimates to which the aggregated model-based estimates are being compared are
themselves subject to sampling variation, it is inappropriate to calibrate at too low a level of aggregation. It
is important to identify this "cut-off" size when considering what calibration to perform.
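The relative difference described above can be computed as in the following sketch. The function name, the domain labels and the totals are hypothetical; the measure is expressed as the percentage increase that the aggregated uncalibrated model-based estimate would need in order to match the aggregated direct estimate.

```python
def calibration_percentages(direct_totals, model_totals):
    """Percentage increase each aggregated model-based total needs to match the direct total."""
    return {
        domain: 100.0 * (direct_totals[domain] / model_totals[domain] - 1.0)
        for domain in direct_totals
    }

# Hypothetical totals aggregated to two large domains (illustrative only).
direct_totals = {"region_A": 10200.0, "region_B": 4950.0}
model_totals = {"region_A": 10000.0, "region_B": 5000.0}

percentages = calibration_percentages(direct_totals, model_totals)
for domain, pct in percentages.items():
    print(f"{domain}: {pct:+.1f}% needed")
```

A positive value means the model-based estimates fall short of the direct estimate for that domain, and a negative value means they exceed it.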
Example
We continue with the model-based approach introduced earlier for the same five years. Our model-based
estimates are required to be consistent with direct estimates at three margins: 6 National age-sex
breakdowns; 12 Government Office Regions; 7 Socio-Economic classifications. We calculate how deviant
the uncalibrated model-based estimates are from these margins, i.e. how much calibration is required to
achieve consistency. The results are in Table 4, in terms of the percentage increases needed to model-based
estimates in each margin. Although none of the percentages are major, the 5th category in the National age-
sex margin consistently requires the largest amount of calibration - this category is women aged 50+, for
whom the relationship between ILO unemployment and claimant count is known to be different from other
age-sex categories. Overall the calibration required is increasing over time (as can be seen by counting the
number of values over 1% per year). Clearly, future performance of the model will need to be monitored,
although at present these percentage differences are not excessive.
            National age-sex    Government Office Region    Socio-Economic Classification
1995/1996 0.4 -0.6 0.3 0.3
1.8 -1.1 0.9 0.4 -0.1 1.1
1.2 -0.1 0.6 -0.1
0.7 0.3 0.2 0.3
0.3 -0.3 0.5 0.4
0.6 0.8 -0.5
1996/1997 -0.1 -0.4 0.5 0.3
2.2 -0.4 -0.8 0.5 -0.1 1.1
-0.5 1.4 1.5 0.8
-0.8 0.5 0.5 0.6
0.2 0.1 0.6 0.6
0.6 0.6 -0.3
1997/1998 -0.3 -1.2 0.0 0.6
2.8 0.8 0.7 0.5 -0.3 1.5
0.4 1.0 0.5 -0.8
0.0 1.3 0.2 0.0
0.6 0.3 0.0 -0.2
0.6 1.0 -2.1
1998/1999 -0.1 -0.4 0.7 1.1
2.2 -0.6 0.0 0.6 1.3 0.5
0.3 1.0 0.6 -0.2
1.6 1.2 0.1 0.0
1.3 1.0 1.2 -0.6
0.1 0.9 1.7
1999/2000 0.1 -0.9 0.7 0.8
3.9 1.6 1.8 1.1 1.2 0.7
0.0 0.7 0.0 1.0
0.9 2.1 -0.3 0.7
1.0 1.3 0.7 1.0
0.4 -0.2 2.3
Table 4 Percentage increases needed to model-based estimates, by margin, to achieve consistency with
direct estimates
3. CONCLUDING REMARKS
In the previous section we have presented four diagnostics that we have found useful for both assessing the
"fit" of a set of model-based small area estimates as well as comparing competing estimation methods (and
models). However, there are a number of other diagnostics that are currently under development. The most
relevant is a test of the robustness of the small area model to slight changes in the sample data. One
approach is via cross-validation, splitting the sample data into smaller subsets, fitting the same model to
each, and deriving a corresponding set of small area estimates. If the subsets are large enough to be
representative of the population we would expect similar models to result and similar estimates to be
obtained from each subset. The major problem here is deciding how to split the original sample data, and
whether reweighting is appropriate. With unit level sample data from each of the small areas of interest,
this should be reasonably straightforward. Unfortunately however, this is not always the case (e.g. the LFS
example, where the data consist of direct estimates by age, sex and small area). We would welcome
comments both on the methods we are currently using, and on ways in which we could add to our
diagnostic repertoire.
REFERENCES
Ambler, R., Caplan, D., Chambers, R., Kovacevic, M. and Wang, S. (2001), “Combining
unemployment benefits data and LFS data to estimate ILO unemployment for small areas: An
application of a modified Fay-Herriot method”, Proceedings of the International Association of
Survey Statisticians, Meeting of the International Statistical Institute, Seoul, August 2001.
ONS (2001), “Small Area Estimation Project”, unpublished report, London, U.K.: Office for National
Statistics.
... To examine the model assumptions and the reliability as well as the validity of the MFH estimates of stunting, wasting and underweight compared to DIR, UFH, and BFH model-based estimates, two types of diagnostics namely, (a) the model diagnostics, and (b) the diagnostics for the small area estimates are applied following [45]. The estimated random area effects under MFH model corresponding to the target variables stunting, wasting and underweight are expected to follow normal distributions with zero mean and constant variances In line with [39,45,46], the model-based estimates are considered as valid and reliable when modelbased small area estimates are (i) consistent with the unbiased direct survey estimates and (ii) more efficient than the direct estimates in terms of the mean squared error. ...
... To examine the model assumptions and the reliability as well as the validity of the MFH estimates of stunting, wasting and underweight compared to DIR, UFH, and BFH model-based estimates, two types of diagnostics namely, (a) the model diagnostics, and (b) the diagnostics for the small area estimates are applied following [45]. The estimated random area effects under MFH model corresponding to the target variables stunting, wasting and underweight are expected to follow normal distributions with zero mean and constant variances In line with [39,45,46], the model-based estimates are considered as valid and reliable when modelbased small area estimates are (i) consistent with the unbiased direct survey estimates and (ii) more efficient than the direct estimates in terms of the mean squared error. In this regard, some measures are considered, for example: the bias diagnostic, the goodness of fit (GoF) diagnostic, the percentage coefficient of variation (CV), the percentage ratio of CVs, and the 95-percentage confidence interval (CI). ...
... Thus, the internal and external diagnostics along with the bias diagnostic measures indicate that the MFH estimator provides unbiased and more precise estimates of district level stunting, wasting and underweight. Hence the developed MFH model-based estimates can be considered as valid and reliable as per [39,45,46]. ...
Article
Full-text available
District-representative data are rarely collected in the surveys for identifying localised disparities in Bangladesh, and so district-level estimates of undernutrition indicators-stunting, wasting and underweight-have remained largely unexplored. This study aims to estimate district-level prevalence of these indicators by employing a multivariate Fay-Herriot (MFH) model which accounts for the underlying correlation among the undernutrition indicators. Direct estimates (DIR) of the target indicators and their variance-covariance matrices calculated from the 2019 Bangladesh Multiple Indicator Cluster Survey microdata have been used as input for developing univariate Fay-Herriot (UFH), bivariate Fay-Herriot (BFH) and MFH models. The comparison of the various model-based estimates and their relative standard errors with the corresponding direct estimates reveals that the MFH estimator provides unbiased estimates with more accuracy than the DIR, UFH and BFH estimators. The MFH model-based district level estimates of stunting, wasting and underweight range between 16 and 43%, 15 and 36%, and 6 and 13% respectively. District level bivariate maps of undernutrition indicators show that districts in northeastern and southeastern parts are highly exposed to either form of undernutrition, than the districts in southwestern and central parts of the country. In terms of the number of undernourished children, millions of children affected by either form of undernutrition are living in densely populated districts like the capital district Dhaka, though undernutrition indicators (as a proportion) are comparatively lower. These findings can be used to target districts with a concurrence of multiple forms of undernutrition, and in the design of urgent intervention programs to reduce the inequality in child undernutrition at the localised district level.
... This study employs the performance of the small area estimation method, particularly EBLUP while using both official data and remote sensing data as an auxiliary variable. The evaluation, in general, follows Brown et al. Brown et al. (2001), which suggested several evaluation methods for small area estimation models, some of which are bias diagnostic, the goodness of fit diagnostic, coverage diagnostic, and calibration diagnostic. This research performs three evaluation methods, excludes the calibration diagnostic. ...
... The test is done by checking whether there is a significant difference between the expected values of the direct estimates and the model-based estimates. The goodness of fit test uses Wald statistics with the test statistics is explained in Equation (13) (Brown et al. 2001). ...
... Coverage diagnostic is based on the assumption that 95% confidence interval of direct and model-based estimates contains the true value. To test the assumption the overlap between both confidence intervals is measured, the result then compared to the Binomial distribution (Brown et al. 2001). To calculate the coverage, we attempt to determine the non-coverage probability first, which is expected to be low (Datta, Hall, and Mandal 2011). ...
Article
Along with the growing popularity of the small area estimation method, the need to utilize good auxiliary variables also increases. Remote sensing data, such as night light imagery, offers advantages such as time-cost efficiency and global coverage but is easily accessible. This research aims to implement night light intensity as an auxiliary variable for the EBLUP model to estimate per capita consumption expenditure at West Java in 2018. This research employs three scenarios of auxiliary variables usage in EBLUP model construction: official data, night light intensity, and the combination between both data. The results show that night light intensity is an efficient auxiliary variable for estimating per capita consumption expenditure. Furthermore, the EBLUP model with a combination of official data and night light as auxiliary variables gives the best accuracy with coefficient of variation (CV) as evaluation.
... The Shapiro-Wilk normality test on the residuals in this research yielded p-values for fruits, vegetables, grains, and food crops of 0.2982, 0.5828, and 0.2162, respectively, indicating the non-rejection of the hypotheses of normality. Finally, we used [46]'s goodness-of-fit diagnostic to determine whether the direct survey and model-based estimates are statistically equivalent. The null hypothesis of interest is as follows: H0: the direct and HB estimates are statistically equivalent. ...
... The histograms, the q-q plots, and the Shapiro-Wilk test all support the normality of the standardized residuals. Furthermore, Ref. [46]'s goodness-of-fit diagnostic revealed that HB estimates are consistent with direct estimates. In general, the root MSE of the corresponding HB estimators was lower than the root MSE of the direct survey estimators (Figure 2). ...
The first important step toward ending hunger is sustainable agriculture, which is a vital component of the 2030 Agenda. In this study, auxiliary variables from the 2011 Population Census are combined with data from the 2016 Community Survey to develop and apply a hierarchical Bayes (HB) small area estimation approach for estimating the local-level households engaged in agriculture. A generalized variance function was used to reduce extreme proportions and noisy survey variances. The deviance information criterion (DIC) preferred the mixed logistic model with known sampling variance over the other two models (Fay-Herriot model and mixed log-normal model). For almost all local municipalities in South Africa, the proposed HB estimates outperform survey-based estimates in terms of root mean squared error (MSE) and coefficient of variation (CV). Indeed, information on local-level agricultural households can help governments evaluate programs that support agricultural households.
... In order to evaluate both the reliability and the validity of the small area estimates, we consider some diagnostic measures in Brown et al. (2001) and Chandra et al. (2011). We focus on the estimates based on the multivariate Fay-Herriot model only, since these are the focus of this section. ...
... In order to evaluate both the reliability and the validity of the small area estimates, we consider some diagnostic measures in Brown et al. (2001) and Chandra et al. (2011). We focus on the estimates based on the multivariate Fay-Herriot model only since these are the estimates presented in Sect. ...
In recent years, attention to the rights of lesbian, gay, bisexual, transgender and intersex (LGBTI) people from institutions, society and scientific bodies has clearly increased. Although equal opportunities in employment are promoted within European countries and by EU legislation, discrimination is still evident in Europe. Many LGBTI people still face bullying and anti-LGBTI discrimination in the workplace and job market. Considerably more progress must be made before every LGBTI person feels accepted and comfortable for who they are in the workplace. Importantly, views on equal opportunities in employment are characterised by spatial heterogeneity at a sub-national level. Therefore, it is necessary to disaggregate estimates of relevant indicators at least to the regional level. This is crucial to identify the regions requiring more attention from policy makers. However, large-scale sample surveys are not designed to produce precise and accurate sub-national estimates. Small area estimation methods offer powerful tools in this context. Here, we produce regional estimates of three indicators measuring views of discrimination in employment of people from LGBTI communities in Europe. The analyses are based on the Eurobarometer 91.4 2019. Our empirical evidence shows that the estimates produced by small area estimation are reliable, giving important information to policy makers.
... In SAE literature, two types of diagnostics are used: the model diagnostics, and the diagnostics for the small area estimates [45]. The former are used to verify the assumptions of the underlying model, i.e., how well the working model performed when it is fitted to data. ...
... The small values of W = 5.68, 15.42, and 25.83 for FIP, FIG, and FIS respectively are significant at the 5% level, which indicates the consistency between the direct and the model-based SAE results. See S1 File for further details. As is common for SAE studies, we conducted two types of diagnostics: (a) the model diagnostics, and (b) the diagnostics for the small area estimates [45]. The model diagnostics in S1 Fig show that the normality assumptions were satisfied. ...
Objective: Food security is an important policy issue in India. As India recently ranked 107th out of 121 countries in the 2022 Global Hunger Index, there is an urgent need to dissect, and gain insights into, such a major decline at the national level. However, the existing surveys, due to small sample sizes, cannot be used directly to produce reliable estimates at local administrative levels such as districts. Design: The latest round of available data from the Household Consumer Expenditure Survey (HCES 2011–12) done by the National Sample Survey Office of India used stratified multi-stage random sampling with districts as strata, villages as first stage and households as second stage units. Setting: Our Small Area Estimation approach estimated food insecurity prevalence, gap, and severity of each rural district of the Eastern Indo-Gangetic Plain (EIGP) region by modeling the HCES data, guided by local covariates from the 2011 Indian Population Census. Participants: In HCES, 5915 (34429), 3310 (17534) and 3566 (15223) households (persons) were surveyed from the 71, 38 and 18 districts of the EIGP states of Uttar Pradesh, Bihar and West Bengal respectively. Results: We estimated the district-specific food insecurity indicators, and mapped their local disparities over the EIGP region. By comparing food insecurity with indicators of climate vulnerability, poverty and crop diversity, we shortlisted the vulnerable districts in EIGP. Conclusions: Our district-level estimates and maps can be effective for informed policy-making to build local resiliency and address systemic vulnerabilities where they matter most in the post-pandemic era. Advances: Our study computed, for the Indian states in the EIGP region, the first area-level small area estimates of food insecurity as well as poverty over the past decade, and generated a ranked list of districts upon combining these data with measures of crop diversity and climatic vulnerability.
... In what follows, we describe some standard diagnostic measures to examine the model assumptions and inspect the reliability and validity of the estimates generated through the MFH method. In line with Brown et al. (2001), two forms of diagnostics, viz. (a) the model diagnostics and (b) the multivariate SAE diagnostics, are employed to endorse the model assumptions. ...
... Next we evaluated the validity and the reliability of the small area estimates by some frequently used diagnostics. In line with Brown et al. (2001) and Chandra et al. (2011), model-based small area estimates should be (1) consistent with unbiased direct survey estimates and (2) more efficient than direct estimates in terms of MSE. The following measures are selected: the bias diagnostic, the goodness-of-fit (GoF) diagnostic, the percentage coefficient of variation (CV) and the 95 percent confidence interval (CI). ...
Earning inequality in India has obstructed the underprivileged from accessing elementary needs such as health and education. The Periodic Labour Force Survey conducted by the National Statistical Office of India generates estimates of earning status at the national and state levels for the rural and urban sectors separately. However, due to small sample sizes, these surveys cannot generate reliable estimates at the micro level, viz. district or block. Thus, owing to the unavailability of district-level estimates, analysis of earning inequality is restricted to the national and state levels. Therefore, the existing variability in the disaggregate-level earning distribution often goes unnoticed. This article describes a multivariate small area estimation method to generate precise and representative district-wise estimates of the earning distribution in rural and urban areas of the Indian State of Bihar by linking Periodic Labour Force Survey data for 2018-2019 with 2011 Population Census data of India. These disaggregate-level estimates and the spatial mapping of the earning distribution are essential for measuring and monitoring the goal of reduced inequalities in the 2030 Agenda for Sustainable Development. They are expected to offer insightful information to decision-makers and policy experts for identifying the areas demanding more attention.
... Further, a set of diagnostics described previously [27,28] are also considered for assessing validity and reliability of the ...
Background: Local governments and other public health entities often need population health measures at the county or subcounty level for activities such as resource allocation and targeting public health interventions, among others. Information collected via national surveys alone cannot fill these needs. We propose a novel, two-step method for rescaling health survey data and creating small area estimates (SAEs) of smoking rates using a Behavioral Risk Factor Surveillance System survey administered in 2015 to participants living in Allegheny County, Pennsylvania, USA. Methods: The first step consisted of a spatial microsimulation to rescale location of survey respondents from zip codes to tracts based on census population distributions by age, sex, race, and education. The rescaling allowed us, in the second step, to utilize available census tract-specific ancillary data on social vulnerability for small area estimation of local health risk using an area-level version of a logistic linear mixed model. To demonstrate this new two-step algorithm, we estimated the ever-smoking rate for the census tracts of Allegheny County. Results: The ever-smoking rate was above 70% for two census tracts to the southeast of the city of Pittsburgh. Several tracts in the southern and eastern sections of Pittsburgh also had relatively high (> 65%) ever-smoking rates. Conclusions: These SAEs may be used in local public health efforts to target interventions and educational resources aimed at reducing cigarette smoking. Further, our new two-step methodology may be extended to small area estimation for other locations and health outcomes.
Bangladesh has experienced a rapid national decline in fertility in recent decades; however, fertility rates vary considerably at the sub-national level (i.e., division). These variations are expected to be more pronounced at lower levels of geography (e.g., district level). However, routinely conducted demographic health surveys are designed for national estimates and do not have adequate samples to produce reliable estimates of fertility rates at lower levels of administrative units, particularly when considering district-level age-specific fertility rates. Data extracted from the Bangladesh Demographic Health Survey 2014 are used to derive direct estimates of age-specific fertility rates and associated smoothed standard errors. These are used as inputs for developing a small area model, which is expressed in a hierarchical Bayesian framework and fitted by Markov Chain Monte Carlo simulation. The model accounts for variation at different levels: women's age group, division, and district. The modeling results show large reductions in the estimated standard errors and provide consistent estimates of fertility at the detailed district age-specific level. There are significant differences in the fertility levels within and between districts and at the division level. Fertility rates are observed to be higher for Sylhet division and for women aged 20-24 years. We use geo-spatial maps of the fertility rates to visualize the variations over districts, and identify hot and cold spots to support better targeted local-level planning and policy decision making for further reductions in fertility rates in Bangladesh.
Health insurance is important in disease management, access to quality health care and attaining Universal Health Care. National and regional data on health insurance coverage needed for policy making are mostly obtained from household surveys; however, estimates at lower administrative units, such as the county level in Kenya, are highly variable due to small sample sizes. Small area estimation combines survey and census data using a model to increase the effective sample size and therefore provides more precise estimates. In this study we estimate the health insurance coverage for Kenyan counties using a binary M-quantile small area model for women (n=14,730) and men (n=12,007) aged 15 to 49 years old. This has the advantage that we avoid specifying the distribution of the random effects, and distributional robustness is automatically achieved. The response variable is derived from the Kenya Demographic and Health Survey 2014 and auxiliary data from the Kenya Population and Housing Census 2009. We estimate the mean squared error using an analytical approach based on Taylor series linearization. The national direct health insurance coverage estimates are 18% and 21% for women and men respectively. With the current health insurance schemes, coverage remains low across the 47 counties. These county-level estimates are helpful in formulating decentralized policies and funding models.
A survey is typically designed to produce reliable estimates of target variables of the population at national and regional levels. For unplanned zones with small sample sizes, reliable estimates are needed for many purposes, but the direct survey estimates are unreliable. The purpose of the study is to improve the direct survey estimates of the z scores of malnutrition for unplanned zones by borrowing auxiliary variables from the census. We applied small area estimation under the Fay-Herriot (FH) model to overcome the problem of generating reliable estimates by linking the Ethiopian Demographic and Health Survey (DHS) with the census data. According to the results of diagnostic measures, the FH model assumptions are satisfactorily confirmed. The results of the model-based estimates also confirmed that the EBLUPs of z scores of malnutrition are more reliable, efficient and precise than the direct survey estimates for small sample sizes in all zones. Therefore, direct survey estimates of malnutrition were highly improved by the EBLUPs in all zones. Zones are important domains for planning and monitoring purposes in the country, and therefore estimates of z scores of malnutrition for under-five children at the zonal level can be helpful for resource allocation and for policymakers and planners.
Ambler, R., Caplan, D., Chambers, R., Kovacevic, M. and Wang, S. (2001), "Combining unemployment benefits data and LFS data to estimate ILO unemployment for small areas: An application of a modified Fay-Herriot method", Proceedings of the International Association of Survey Statisticians, Meeting of the International Statistical Institute, Seoul, August 2001.
ONS (2001), "Small Area Estimation Project", unpublished report, London, U.K.: Office for National Statistics.