Reducing Socially Desirable Responses in Epidemiologic Surveys: An Extension of the Randomized-response Technique

Even though the validity of self-reports of sensitive behaviors is threatened by social desirability bias, interviews and questionnaires are widely used in epidemiologic surveys on these topics. In the randomized-response technique, a randomization device is used to determine whether participants are asked to respond truthfully or whether they are prompted to provide a prespecified response. In this study, the randomized-response technique was extended by using a cheating-detection modification to obtain more valid data. The survey was on the dental hygiene habits of Chinese college students. Whereas only 35% of men and 10% of women admitted to insufficient dental hygiene when questioned directly, 51% of men and 20% of women attested to this socially undesirable behavior in a randomized-response survey. Given the considerable discrepancy between the results obtained by direct questioning and by using the randomized-response technique, we propose that this technique be considered for use in epidemiologic studies of sensitive behaviors.


Morten Moshagen, Jochen Musch, Martin Ostapczuk, and Zengmei Zhao
Poor dental hygiene is a risk factor for dental diseases. In a survey of dental hygiene in China, 31% of 20–29-year-olds admitted to brushing their teeth less than twice a day. The validity of these estimates may be questioned, however, because self-reported hygiene practices are likely to be distorted by socially desirable responding. Studies comparing self-report data against gold standard measures have repeatedly shown over-reporting of desirable behaviors such as physical activity and under-reporting of undesirable behaviors such as drug use, energy intake, and sexual risk behavior. Thus, there is reason to suspect that these reported prevalence estimates of dental hygiene habits may have been inflated by social desirability bias.
The randomized-response technique was developed to overcome this response bias by increasing the confidentiality of responses. The basic idea is to add random noise to the responses such that there is no direct link between a participant’s response and his or her true status. In the forced-response variant of this technique, a randomization device (with known probability distribution) is used to determine whether participants are asked to respond truthfully or whether they are prompted to provide a prespecified response regardless of their true status. This procedure guarantees that affirmative responses are no longer unequivocally linked to a socially undesirable attribute and are therefore no longer stigmatizing for the participants. Consequently, the randomized-response technique encourages more honest responses and, in turn, may provide more valid prevalence estimates of sensitive issues, such as drug use [16–18] and sexual behavior.
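The forced-response logic described above can be illustrated with a minimal Python sketch. It assumes that every respondent follows the instructions, that the prompted answer coincides with the sensitive answer, and that the prompting probability is known. The function names and the simulation itself are illustrative and are not taken from the study. Under these assumptions, the observed rate of sensitive answers equals p + (1 − p) × prevalence, which can be solved for the prevalence.

```python
import random


def simulate_forced_response(true_prevalence, p_forced, n, seed=0):
    """Simulate n fully compliant respondents (an idealizing assumption).

    Returns the observed proportion giving the sensitive response.
    """
    rng = random.Random(seed)
    sensitive = 0
    for _ in range(n):
        forced = rng.random() < p_forced          # randomization device says "give the prompted answer"
        carrier = rng.random() < true_prevalence  # respondent truly has the sensitive attribute
        if forced or carrier:
            sensitive += 1
    return sensitive / n


def estimate_prevalence(observed_rate, p_forced):
    """Invert observed_rate = p_forced + (1 - p_forced) * prevalence."""
    if not 0.0 <= p_forced < 1.0:
        raise ValueError("p_forced must lie in [0, 1)")
    return (observed_rate - p_forced) / (1.0 - p_forced)


if __name__ == "__main__":
    rate = simulate_forced_response(true_prevalence=0.4, p_forced=0.25, n=100_000)
    print(round(estimate_prevalence(rate, 0.25), 3))
```

No individual answer reveals a respondent's status, yet the aggregate rate still identifies the prevalence, provided respondents comply.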
Despite its successful applications, the randomized-response technique has been criticized as being susceptible to respondents who do not answer as directed by the randomization device. The randomized-response technique underestimates the prevalence of sensitive behaviors to the extent that participants fail to comply with the instructions by denying a sensitive attribute even when prompted to admit to it by the randomization device.
Addressing this issue, Clark and Desharnais proposed a cheating-detection modification of the randomized-response technique that explicitly assumes that some respondents might fail to comply with the instructions. The modification (Figure) divides the population into 3 disjoint groups. The first group (π) consists of compliant respondents who honestly admit being carriers of the sensitive attribute. The second group (β) consists of compliant respondents who truthfully deny the sensitive attribute. The third group (γ) consists of noncompliant cheaters who do not conform to the instructions and deny the sensitive attribute irrespective of the outcome of the randomization process. By symmetry, there may also be respondents who are not carriers of the sensitive attribute but claim it. However, we expect such self-incriminating behavior to be rare, and we therefore ignore it in the model.
It is important to note that nothing is assumed regarding the true status of noncompliant respondents. It is conceivable that these respondents deny a sensitive behavior in which they have been engaged, but it is also possible that innocent respondents want to rule out even the slightest suspicion and therefore deny that they committed an undesirable act despite being told otherwise by the randomization procedure. The estimated proportion of cheaters can be used to compute an upper bound in a worst-case scenario, which assumes that all noncompliant respondents are, in fact, carriers of the sensitive attribute.
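The three-group model described above can be turned into point estimates with a short Python sketch. It assumes two independent samples that differ only in the probability of being prompted to give the sensitive answer. The names `pi`, `beta`, `gamma`, `p_1`, and `p_2` are labels chosen here for illustration. With honest carriers always giving the sensitive answer, honest noncarriers giving it only when prompted, and cheaters never giving it, the expected rate in sample i is pi + beta × p_i. Two such equations can be solved for pi and beta, and gamma follows as the remainder.

```python
def cheating_detection_estimates(rate_1, rate_2, p_1, p_2):
    """Solve rate_i = pi + beta * p_i (i = 1, 2) for the three group shares.

    rate_i: observed proportion of sensitive answers in sample i
    p_i:    probability of being prompted to give the sensitive answer in sample i
    """
    if p_1 == p_2:
        raise ValueError("the two prompting probabilities must differ for identifiability")
    beta = (rate_2 - rate_1) / (p_2 - p_1)              # honest noncarriers
    pi = (p_2 * rate_1 - p_1 * rate_2) / (p_2 - p_1)    # honest carriers
    gamma = 1.0 - pi - beta                             # noncompliant respondents
    return pi, beta, gamma


if __name__ == "__main__":
    # Hypothetical shares, used only to show that the inversion recovers them.
    pi, beta = 0.5, 0.4
    p_1, p_2 = 1 / 6, 5 / 6
    rates = (pi + beta * p_1, pi + beta * p_2)
    print(cheating_detection_estimates(rates[0], rates[1], p_1, p_2))
```

If the two prompting probabilities were equal, the two equations would coincide and the shares could not be separated, which is why two distinct probabilities are needed.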
To explore the magnitude of response bias in self-reported hygiene habits, the cheating-detection modification was employed to investigate teeth-brushing behavior among Chinese college students. In addition, the modification was compared with an anonymous self-report measure to estimate how much response bias can be reduced by this method.
A total of 2254 undergraduates (55% women; aged 18–24 years) from the University of Beijing, China, volunteered to participate in this study. Students completed the questionnaire during their regular classes.
Measures and Procedures
The participants completed an anonymous questionnaire comprising demographic information, several questions not pertinent to this study, and the sensitive question: “Do you brush your teeth at least twice a day?” The participants were randomly assigned to 1 of 3 conditions. Two conditions with different probabilities of being prompted to reply “no” (P1 and P2) are required to make the cheating-detection modification identifiable. The participants’ month of birth was used as the randomization device to keep the randomization procedure simple and transparent. In the low probability condition (P1: n = 900; 56% women), participants born in January or February were instructed to reply “no” independently of their true behavior, whereas participants born in another month were prompted to reply truthfully. In the high probability condition (P2: n = 891; 54% women), participants born in January or February were asked to respond truthfully, whereas the remaining participants were prompted to reply “no.” According to birth statistics provided by the National Bureau of Statistics of China, the randomization probabilities P1 and P2 approximated 0.17 and 0.83, respectively. In the direct questioning condition (n = 463; 54% women), participants were simply asked to respond truthfully.
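The birth-month randomization described above can be written out as a small Python sketch. The function names and the condition labels "low" and "high" are illustrative. The default weights assume that births are spread evenly over the 12 months, which is a simplification and not the birth statistics used in the study. Real monthly shares could be passed in through `month_weights`.

```python
FORCED_WINDOW = {1, 2}  # January and February


def instruction(birth_month, condition):
    """Return the instruction a respondent receives in a given condition."""
    if not 1 <= birth_month <= 12:
        raise ValueError("birth_month must be between 1 and 12")
    in_window = birth_month in FORCED_WINDOW
    if condition == "low":
        return "reply no" if in_window else "reply truthfully"
    if condition == "high":
        return "reply truthfully" if in_window else "reply no"
    raise ValueError("condition must be 'low' or 'high'")


def forced_probability(condition, month_weights=None):
    """Probability of being prompted to reply 'no'.

    month_weights maps month -> share of births; the default assumes
    equal shares for all months (an assumption made for this sketch).
    """
    weights = month_weights or {month: 1 / 12 for month in range(1, 13)}
    return sum(w for month, w in weights.items()
               if instruction(month, condition) == "reply no")


if __name__ == "__main__":
    print(round(forced_probability("low"), 2), round(forced_probability("high"), 2))
```

Because the investigator does not know a respondent's month of birth, an individual "no" cannot be attributed to either the instruction or the respondent's true behavior.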
Statistical Analysis
Closed-form solutions for parameter estimation in the cheating-detection modification do not allow a statistical comparison of subgroups. We therefore conducted our analysis within the more general framework of multinomial modeling. By converting the nonbinary tree model into a statistically equivalent binary tree representation (for details, see Ostapczuk et al), established statistical procedures of multinomial modeling can be used to estimate the parameters and to test restrictions on them. Parameter estimates were obtained by minimizing the asymptotically χ²-distributed log-likelihood ratio statistic G² using the EM algorithm.
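A minimal Python sketch of how an EM iteration might look for this kind of model is given below. It is not the software or the binary-tree reparameterization used in the study. It treats group membership (honest carrier, honest noncarrier, noncompliant) as the latent variable and assumes prompting probabilities strictly between 0 and 1. All names and the example counts are hypothetical.

```python
import math


def fit_em(no_counts, yes_counts, p_forced, max_iter=5000, tol=1e-12):
    """EM estimates of (pi, beta, gamma) from 'no'/'yes' counts per condition.

    'no' is the sensitive answer: given by honest carriers (pi) and by
    prompted honest noncarriers (beta * p). 'yes' comes from unprompted
    honest noncarriers (beta * (1 - p)) and from noncompliant respondents (gamma).
    """
    total = sum(no_counts) + sum(yes_counts)
    pi = beta = gamma = 1 / 3
    for _ in range(max_iter):
        exp_pi = exp_beta = exp_gamma = 0.0
        for n_no, n_yes, p in zip(no_counts, yes_counts, p_forced):
            prob_no = pi + beta * p
            prob_yes = beta * (1 - p) + gamma
            # E-step: split each observed count over the branches that can produce it.
            exp_pi += n_no * pi / prob_no
            exp_beta += n_no * beta * p / prob_no + n_yes * beta * (1 - p) / prob_yes
            exp_gamma += n_yes * gamma / prob_yes
        # M-step: group shares are the expected group counts divided by the total.
        new = (exp_pi / total, exp_beta / total, exp_gamma / total)
        change = max(abs(a - b) for a, b in zip(new, (pi, beta, gamma)))
        pi, beta, gamma = new
        if change < tol:
            break
    return pi, beta, gamma


def g_squared(no_counts, yes_counts, p_forced, pi, beta, gamma):
    """Log-likelihood ratio statistic: 2 * sum(observed * ln(observed / expected))."""
    g2 = 0.0
    for n_no, n_yes, p in zip(no_counts, yes_counts, p_forced):
        n = n_no + n_yes
        for observed, prob in ((n_no, pi + beta * p), (n_yes, beta * (1 - p) + gamma)):
            if observed > 0:
                g2 += 2 * observed * math.log(observed / (n * prob))
    return g2


if __name__ == "__main__":
    # Hypothetical counts, not the study data.
    no_counts, yes_counts, p_forced = (510, 750), (390, 150), (1 / 6, 5 / 6)
    estimates = fit_em(no_counts, yes_counts, p_forced)
    print(estimates, g_squared(no_counts, yes_counts, p_forced, *estimates))
```

In this framework, a restriction such as equal parameters across subgroups could be tested by fitting the restricted model and comparing the resulting G² values. The sketch stops short of that step.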
There were sizeable differences in the proportion of men (35%; SE 3.3) and women (10%; SE 1.9) admitting insufficient teeth-brushing behavior with direct questioning. The cheating-detection modification was therefore estimated separately by sex (Table).
FIGURE. A multinomial representation of the cheating detection variant of the randomized response technique. To make the model identifiable, 2 independent random samples are questioned with different probabilities of being prompted to reply “no” (P1 and P2).
Using the cheating-detection modification, 51% (SE 3.2) of men and 20% (SE 2.7) of women reported insufficient dental hygiene. These estimates were considerably higher than the estimates with direct questioning for both men and women, indicating substantial under-reporting with direct questioning. Moreover, a substantial proportion of noncompliance with the instructions was observed for both men (10.1%; SE 2.4) and women (13.0%; SE 2.5).
Depending on whether noncompliant respondents were assumed to have engaged in insufficient teeth brushing, the prevalence estimate ranged from a lower bound of 51% for men and 20% for women to an upper bound of 62% for men and 32% for women.
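The bounds above follow from a simple rule: the lower bound counts only respondents who admitted the behavior, and the upper bound adds all noncompliant respondents. The Python sketch below applies that rule to the rounded estimates quoted in the text. The function name is illustrative. With these rounded inputs the sums come out near 61% for men and 33% for women, not the reported 62% and 32%. One possible explanation is that the reported bounds were computed from unrounded estimates, which are not given here.

```python
def prevalence_bounds(admitted_share, noncompliant_share):
    """Lower bound: admitted only. Upper bound: admitted plus all noncompliant."""
    return admitted_share, admitted_share + noncompliant_share


# Rounded estimates as quoted in the text (admitted share, noncompliant share).
reported = {"men": (0.51, 0.101), "women": (0.20, 0.130)}

for group, (admitted, noncompliant) in reported.items():
    lower, upper = prevalence_bounds(admitted, noncompliant)
    print(f"{group}: {lower:.3f} to {upper:.3f}")
```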
Survey data may reflect what respondents want to tell the investigator, rather than their actual behavior. We used a cheating-detection modification of the randomized-response technique to improve the validity of response data on dental hygiene habits in a sample of Chinese college students. Consistent with previous studies, only 35% of men and 10% of women reported insufficient dental hygiene habits when questioned directly. When the cheating-detection modification was employed, however, the proportions increased considerably for men, and almost doubled for women. Assuming that all noncompliant respondents in fact brushed their teeth less than twice a day, the upper-bound prevalence estimate of insufficient dental hygiene habits in the present sample was 62% for men and 32% for women. Prevalence estimates of dental hygiene habits may also be positively biased in other populations. More generally, direct questioning may provide strongly distorted prevalence estimates in surveys of socially undesirable behavior. The same is also true, however, for traditional variants of the randomized-response technique that cannot detect cheating, because the prevalence of a sensitive attribute is underestimated to the extent that there is noncompliance with instructions.
Several limitations should be considered. First, randomized-response models introduce random error and induce greater sampling variance. The randomized-response technique therefore requires considerably larger samples than a direct question. This loss of efficiency is outweighed by a gain in precision only when the attribute under investigation is sufficiently sensitive. Second, the randomized-response technique is more complicated to administer because the respondents have to understand how it protects their privacy. Although the randomized-response technique has been successfully used with older and less educated respondents, noncompliance rates tend to increase in such populations. Finally, as the true status of any individual remains unknown, it is difficult to compute measures of association between a randomized-response variable and other variables of interest [34–36]. Such limitations notwithstanding, the cheating-detection modification provides a means to improve prevalence estimates of sensitive behaviors and may be useful in epidemiologic surveys of such behaviors.
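The efficiency loss named as the first limitation can be made concrete with a Python sketch for the basic forced-response design, not for the cheating-detection model used here. It assumes fully honest answers in both survey modes, so it shows only the variance side of the trade-off and ignores response bias. The function names and example values are illustrative.

```python
def forced_response_variance(prevalence, p_forced, n):
    """Sampling variance of the forced-response prevalence estimator."""
    rate = p_forced + (1.0 - p_forced) * prevalence  # expected share of sensitive answers
    return rate * (1.0 - rate) / (n * (1.0 - p_forced) ** 2)


def direct_variance(prevalence, n):
    """Sampling variance of a proportion under direct questioning."""
    return prevalence * (1.0 - prevalence) / n


def sample_size_ratio(prevalence, p_forced):
    """Factor by which n must grow for forced response to match direct questioning."""
    return forced_response_variance(prevalence, p_forced, 1) / direct_variance(prevalence, 1)


if __name__ == "__main__":
    # Hypothetical prevalence of 30% with a 1-in-6 chance of a prompted answer.
    print(round(sample_size_ratio(0.30, 1 / 6), 2))
```

A higher prompting probability gives respondents more protection but inflates this ratio further, which is the design tension behind the limitation.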
