Article

An Empirical Validation Study of Popular Survey Methodologies for Sensitive Questions


Abstract

When studying sensitive issues, including corruption, prejudice, and sexual behavior, researchers have increasingly relied upon indirect questioning techniques to mitigate such known problems of direct survey questions as underreporting and nonresponse. However, there have been surprisingly few empirical validation studies of these indirect techniques because the information required to verify the resulting estimates is often difficult to access. This article reports findings from the first comprehensive validation study of indirect methods. We estimate whether people voted for an anti-abortion referendum held during the 2011 Mississippi General Election using direct questioning and three popular indirect methods: list experiment, endorsement experiment, and randomized response. We then validate these estimates against the official election outcome. While direct questioning leads to significant underestimation of sensitive votes against the referendum, indirect survey techniques yield estimates much closer to the actual vote count, with endorsement experiment and randomized response yielding the least bias.


... This section considers evidence of potential bias in survey responses on a range of topics and methods to address this. Estimates of attitudes about sensitive topics, such as immigration, racism and abortion, vary depending on the level of anonymity respondents are provided (Glynn, 2013; Kuklinski et al., 1997; Rosenfeld et al., 2016). Reported prevalence of sensitive behaviours, such as plagiarism, shoplifting and sexual activity, also varies (Chuang et al., 2021; Coutts et al., 2011; Krumpal, 2013). ...
... The study showed high levels of respondents not answering as instructed by the randomisation device, which the authors interpret as 'cheating' and hence as some evidence of social desirability bias. However, others have observed that the Randomised Response Technique can lead to high non-response rates due to difficulty understanding the instructions (Rosenfeld et al., 2016). ...
... Like the Randomised Response Technique, list experiments provide respondents with 'permanent' anonymity by not asking them directly for their response (Chuang et al., 2021). However, the list experiment is simpler to employ and has lower non-response rates (Rosenfeld et al., 2016). As it does not yet appear to have been used in relation to attitudes or behaviours towards disability, in the following section we explain its logic and explore its use in relation to other issues. ...
Technical Report
Despite the right of disabled people to full social and economic inclusion, many face multiple day-to-day and systemic challenges. These include but are not limited to additional expenses, access to housing, and everyday accessibility difficulties. Surveys show the general public hold positive attitudes towards policies that seek to enable disabled people to overcome these challenges, but standard survey methods are susceptible to response biases that may overestimate this support. This study aimed to test whether two such biases influence support for disability policy in Ireland: social desirability bias (i.e. the tendency for survey respondents to alter their responses in order to present themselves in a positive light); and inattention to the implications of policy support (e.g. that welfare policies require funding). Together the survey experiments covered a range of policy issues and types of disability, as identified in previous research and in consultation with the disability advisory group for the project. A nationally representative sample of 2,000 adults took part in the online study. One stage of the study used list experiments to test for social desirability bias in responses to three issues: (1) support for increased social welfare for disabled people, (2) support for prioritising disabled people for social housing and (3) how many people admit to parking in a disabled parking space without a permit. In each list experiment, participants were assigned at random to one of two groups. One (‘control’) group was presented with a list of items unrelated to the topic of interest (in this case, disability policy) and asked how many they agree with. The other (‘treatment’) group was presented with the same list but with the addition of an item about the topic of interest. 
Any difference in the average response between the groups can be attributed to the added item and gives an indication of support for that item when participants are provided full anonymity (because they are never asked directly about their support for that item). Allowing participants to respond anonymously minimises the influence of the desire to be viewed positively by others on responses. Another stage of the study tested the influence of question detail on policy support. The policies in this stage related to (1) increased cost of living support for disabled people, (2) support for children with disabilities and (3) support for building wheelchair accessible infrastructure. Participants were randomly allocated to a group that was asked for support for a policy without any specified funding mechanism, or to a group that was asked about support for the same policy but with the funding mechanism specified, for example that the policy would be funded through a budget reallocation or a tax increase. The study shows that while the majority of people in Ireland support most policies that aim to enable disabled people to participate fully in society, standard surveys are likely to lead to inaccurate estimates of support. Approximately one-in-seven people are estimated to express support for some policies when asked directly but not when allowed to respond anonymously, with a similar change in support when funding mechanisms or policy trade-offs are made explicit. Support is stronger among those more familiar with disability issues, although further research is required to understand why. If those familiar with disability simply better understand the challenges associated with disability, this implies that enhancing public understanding of the challenges and costs of disability would strengthen support. 
If it is because they know someone who will directly benefit from the policy, further research on how people understand and recognise disability among people in their social networks may help. Complementing standard surveys with reliable experimental methods is recommended to avoid misperceptions of support for disabled people and to identify where potentially negative attitudes may need to be challenged.
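The list experiment logic described in this abstract can be sketched in a few lines. This is a minimal illustration, not the report's own analysis: the function name, the item counts and the direct-question answers are all hypothetical, invented only to show how the difference in group means estimates support for the added item and how that estimate can be compared with a direct question.

```python
from statistics import mean


def list_experiment_estimate(control_counts, treatment_counts):
    """Estimate support for the added item as treatment mean minus control mean."""
    if not control_counts or not treatment_counts:
        raise ValueError("both groups need at least one response")
    return mean(treatment_counts) - mean(control_counts)


# Hypothetical data, invented for illustration only.
control = [1, 2, 2, 3, 1, 2]       # number of baseline items each person agrees with
treatment = [2, 3, 2, 3, 2, 3]     # same baseline items plus the item of interest
direct_answers = [1, 1, 1, 0, 1, 1]  # 1 = expresses support when asked directly

anonymous_support = list_experiment_estimate(control, treatment)
direct_support = mean(direct_answers)

# A positive gap suggests support is overstated under direct questioning.
gap = direct_support - anonymous_support
print(f"anonymous={anonymous_support:.3f}, direct={direct_support:.3f}, gap={gap:.3f}")
```

Because the estimate is a difference between two group means, it describes the sample as a whole; it cannot tell which individual respondents support the added item.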
... In order to do this, we put a significant amount of work into ensuring that the survey was suitable for its intended audience as well as its primary purpose, which was to encourage respondents to participate in the survey to a greater degree (Dillman et al., 2014; Rosenfeld et al., 2015). We made sure that the responses were collected in a secure environment by having the participants fill out an online survey that they administered themselves, compensating them for their time with free books and participation certificates, and making the goals of the study very clear to them. These steps ensured that the responses were collected safely (Rosenfeld et al., 2015). ...
Article
This research was conducted to collect information that may be used in arguing for the continuance of creative writing programmes in Chilean public schools, namely for grades seven through twelve. When attempting to improve the current educational system, it is essential to consider the views held by educators, since these convictions influence how they evaluate the content of the classes they teach. Consequently, the conceptual frameworks that teachers subscribe to have an impact on the way that they perceive the curriculum. We therefore concluded that it would be beneficial to conduct research on the connections between the current implementation of the national curriculum by language teachers and their perceptions of the five accessible instructional paradigms for language learning. Even though it is very likely that attitudes regarding the instruction of writing are derived from concepts about the more general subject of language learning, we did consider this more general category. The results of our investigation are now accessible to the public after 182 teachers from around Chile responded to our questionnaire (response rate: 47 per cent). Although the instructors indicated exceptionally high levels of adherence, in terms of practices and beliefs, to four of the curricular paradigms, the fifth, the communicative paradigm, showed a low degree of adherence.
It would appear that the instructors' perspectives on writing and on other, more general issues were related to how successfully they could implement their instructional strategies for writing in the classes they taught, which in turn directly affected the students' learning experiences. The same held for several other, more fundamental features and for their various perspectives on the process of writing. Based on the findings, public authorities seeking to improve writing education should concentrate on enhancing the instructors' perspectives on writing instruction, particularly communicative writing instruction, and on the linkages between the five paradigms, because the study indicates that these are the areas in which most efforts should be concentrated. As a matter of public policy, we suggest to those in charge of making decisions that efforts to improve education be concentrated on grades 9–12 rather than grades 7–8. We hold this view firmly.
... Second, we made sure that our questionnaire was as brief and simple to interpret as possible to minimize the costs incurred by our participants. Additionally, we ensured that it was interesting enough for people to respond to, properly targeted its demographic, and had a clear goal (Dillman, 2000; Dillman et al., 2014; Rosenfeld et al., 2015). ...
... Additionally, we ensured that it was interesting enough for people to respond to, properly targeted its demographic, and had a clear goal (Dillman, 2000; Dillman et al., 2014; Rosenfeld et al., 2015). We also wanted to reduce the possibility that our respondents would get bored or worn out, so we used the online platform that was made available to us to ask them about a variety of topics. ...
Article
Since 2010, Chile has been operating programs aimed at fostering the development of students' writing abilities and encouraging them to write more frequently. These programs are intended to improve the educational experiences of students in order to better prepare them for active participation as contributing citizens of today's society. However, to effect future changes in grades 7 through 12, a better understanding of the present ways of teaching writing at these grade levels is necessary. These levels of education span middle school, high school, and college. By focusing on the environments in which the various models of writing instruction are implemented, we attempted to construct a picture of the paradigms of writing instruction now employed in Chilean public schools for students in grades 7 through 12. This was done to gain a deeper understanding of the current state of writing instruction in Chile and to illustrate, through examples, the many meanings derived from these paradigms. To do this, we surveyed 182 Spanish language instructors from various locations around Chile. The findings indicated that writing teachers, on the whole, had a positive attitude toward the linguistic, cultural, and procedural paradigms that underpin writing instruction. In contrast, the communicative paradigm was less cohesive and did not have as strong a support base as the other paradigms.
We recommend that future government programs place greater emphasis on reducing the amount of work teachers are required to complete and on providing them with the assistance they require in the classroom, particularly for teachers of grades 9 through 12. By detailing the current practices of Chilean educators and how those practices connect to the setting in which they are applied, this research can serve as a source of guidance for the international community of literacy scholars. Consequently, it may help to create a more in-depth understanding of the criteria for a high-quality writing education worldwide.
... There is robust evidence that people underreport the truth about sensitive topics when they are directly asked in a survey. Such research compares self-reported survey answers to observed or independently verifiable outcomes such as tax records, receipt of welfare benefits, drug prescriptions for mental illness, and voting in an abortion law referendum (Gottschalk and Huynh 2010; Rosenfeld, Imai, and Shapiro 2016; Bharadwaj, Pai, and Suziedelyte 2017; Murray-Close and Heggeness 2018; Meyer and Mittag 2019). However, reliable, unbiased observational data are rare for many sensitive topics, including IPV. This is particularly true in developing countries, where there is limited administrative data from the medical or legal services that IPV survivors may use. ...
... This assumption is supported by empirical evidence from the only study we identified that compares an observed sensitive behavior with self-reported behavior measured using direct and indirect methods including the list experiment. Rosenfeld, Imai, and Shapiro (2016) found that when they were asked directly, people underreported their past voting behavior on a sensitive abortion referendum in Mississippi. However, rates were higher and closer to the truth when they were asked using the list experiment. ...
Article
This paper analyzes the magnitude and predictors of misreporting on intimate partner violence. Women in Nigeria were randomly assigned to answer questions using either an indirect method (list experiment) that gives respondents anonymity, or the standard, direct face-to-face method. Intimate partner violence rates were up to 35 percent greater when measured using the list method than the direct method. Misreporting was associated with indicators often targeted in empowerment and development programs, such as education and vulnerability. These results suggest that standard survey methods may generate significant underestimates of the prevalence of intimate partner violence, and biased correlations and treatment effect estimates.
... Meta-analyses have demonstrated that respondents are more likely to report sensitive behaviours or attitudes in list experiments and RRT compared to direct questioning, which is susceptible to sensitivity bias (Lensvelt-Mulders et al., 2005; Ehler, Wolter and Junkermann, 2021; Li and van den Noortgate, 2022; but see also Rosenfeld, Imai and Shapiro, 2016; Blair, Coppock and Moor, 2020). While both list experiments and RRT offer advantages over direct questioning, research indicates that RRT is associated with higher rates of respondent misunderstanding and non-response compared to list experiments (Holbrook and Krosnick, 2010a; Coutts and Jann, 2011). ...
Article
Discrimination is one of the largest barriers that immigrants and racial/ethnic minorities face in contemporary society. Social scientists have developed and applied field experimental methods to detect the existence and prevalence of discrimination in various domains. In addition, researchers have utilized questionnaires to directly ask discrimination victims about their experiences and the frequency of discrimination they encounter. However, self-reports of discrimination may be biased due to judgment errors in attributing mistreatment to discrimination and intentional overreporting (vigilance) or underreporting (minimization) of discrimination. In this study, we propose a two-stage model that distinguishes between these judgment and reporting biases. We argue that vigilance and minimization stem from sensitivity concerns. We conducted a list experiment with African American respondents who were asked about their experiences of employment and everyday discrimination. Comparing the list experiment and direct question estimates, we find no evidence of systematic underreporting or overreporting of employment discrimination. For everyday discrimination, we find overreporting concentrated among ideologically liberal African Americans. These results provide new insights into biases in self-reported discrimination and suggest researchers should be attentive to the conditions under which these biases arise.
... Four control items, one each of ample and scarce prevalence and two mutually exclusive ones, are recommended to prevent respondents from considering all items applicable (ceiling), or none (floor), situations that would compromise perceived anonymity (Kuklinski et al., 1997; Blair & Imai, 2012); sensitive controls should be avoided if possible (Droitcour et al., 1991; Ehler et al., 2021). ICT is generally rated as preferable to other unobtrusive survey procedures such as the randomized response technique, which guarantees privacy by requesting a score for either the sensitive item or an unrelated one, for example; such petitions might confuse or even irritate some participants (Coutts & Jann, 2011; Hox & Lensvelt-Mulders, 2008; Rosenfeld et al., 2016; Wolter & Diekmann, 2021). Although list experiments are comparatively straightforward, a growing number of papers have voiced concerns about various kinds of non-strategic response error and ensuing instability (Tsuchiya & Hirai, 2010; Kiewiet de Jonge & Nickerson, 2014; Ahlquist, 2018; Gosen et al., 2019; Kramon & Weghorst, 2019; Jerke et al., 2019; Ehler et al., 2021; Kuhn & Vivyan, 2021; Riambau & Ostwald 2021; Jerke et al., 2022). ...
Article
This Research Note reports on a list experiment regarding anti-immigrant sentiment (n=1,965) that was fielded in Spain in 2020. Among participants with left-of-center ideology, the experiment produced a negative difference-in-means between treatment and control. Drawing on Zigerell's (2011) deflation hypothesis, we assess the possibility that leftist treatment group respondents may have altered their scores by more than one to distance themselves unmistakably from the sensitive item. We consider this possibility plausible in a context of intense polarization where immigration attitudes are closely associated with political ideology. This study's data speak to the results of recent meta-analyses that have revealed list experiments to fail when applied to prejudiced attitudes and other highly sensitive issues, i.e., precisely the kind of issues with regard to which the technique ought to work best. We conclude that the possibility of strategic response error in specific respondent categories needs to be considered when staging and interpreting list experiments.
... This is especially pronounced in authoritarian settings, wherein politics is inherently less transparent compared to democratic settings. Recent research suggests that while a direct question can be considerably biased, such bias can be mitigated through the use of indirect questioning techniques (Rosenfeld et al., 2013). To identify the importance of superior endorsement, we adopted indirect questioning methods. ...
Article
The influence of politics in policy implementation is a widespread global phenomenon, but bureaucratic responses to it remain understudied. This study examines how superior endorsement affects local officials’ compliance patterns with higher authorities’ administrative directives for regulating air pollution in China. Despite China’s stance on aligning environmental protection with socioeconomic development, we point out that superior endorsement might incentivize subordinates to downplay central policy intentions and fall in line with superior governments’ policy priorities through blunt measures. Drawing from an original dataset of Chinese officials, we find that local officials who acknowledge the importance of superior endorsement prefer to fulfill priority tasks of pollution regulation by shutting down polluting enterprises, even at high social and economic costs. However, the effect of superior endorsement is not statistically significant for officials who work in Party organizations and higher-level governments. Our results suggest that the prevalence of political control by superiors may enhance local policy effectiveness at the cost of diverging from institutionalized rule-based policy implementation.
... There are several other techniques for measuring sensitive questions. Rosenfeld et al. (2016) evaluate list experiments alongside other techniques, including indirect questioning techniques, endorsement experiments, and randomized response techniques (see also Blair et al. 2015). Researchers need to consider the trade-offs of the various approaches. ...
Conference Paper
Collecting public opinion data is challenging in the shadow of war. And yet accurate public opinion is crucial. Political elites rely on it and often attempt to influence it. Therefore, it is incumbent on researchers to provide independent and reliable wartime polls. However, surveying in wartime presents a distinctive set of challenges. We outline two challenges facing polling in war: under-coverage and response bias. We highlight these challenges in the context of the Russia-Ukraine war, drawing on original panel survey data tracing the attitudes of the same people prior to and after Russia's full-scale invasion of Ukraine in 2022. We conclude with some lessons for those employing survey methods in wartime, and point to steps forward, in Ukraine and beyond.
... The fundamental effect of SDB is the masking of the true response, which poses a serious threat to the validity of the findings (Tourangeau et al. (2007) and Schill and Kirk (2017)). This remains a noticeable phenomenon in almost every field of social research; for more on the pervasive nature of SDB, see also Krumpal (2013), Rosenfeld et al. (2015), Hussain et al. (2019) and Vesely and Klöckner (2020). Figure 1 aims to aid comprehension of the discussion above. ...
Chapter
Chapter 7, by Salman A. Cheema, Irene L. Hudson, and colleagues, deals with social desirability bias while studying socially stigmatized behaviors. The study deals with the situation where data have already been collected, and an initial analysis reveals the patterns pointing towards the existence of social desirability bias. Authors have demonstrated the applicability of the proposed model by studying the contraceptive behaviors and their deriving factors in a multi-linguistic, culturally diverse and relatively more rigid society. Handbook of Research on Cultural and Cross-Cultural Psychology Cognitive Science and Psychology. Available from: https://www.researchgate.net/publication/373976370_Handbook_of_Research_on_Cultural_and_Cross-Cultural_Psychology_Cognitive_Science_and_Psychology [accessed Dec 05 2023].
... Bullock, Imai and Shapiro 2011;Blair, Imai, and Lyall 2014). Comparisons of survey methodologies indicate the advantages of endorsement approaches over others, including list approaches (Rosenfeld, Imai, and Shapiro 2015). There are different reasons why respondents may worry about expressing their true opinion of President Putin in the 'Near Abroad' -and this worry may be stronger in some countries than in others. ...
... Second, while we rely on a novel and credible source of data about municipal government corruption-the street-level bureaucrats that work alongside those officials-the information analyzed here remains reported corruption rather than directly observed incidents of corrupt behavior. Despite the many challenges, we encourage innovative work to corroborate reported data with additional information about instances of corruption, such as survey-based approaches with randomized prompts (Rosenfeld, Imai, and Shapiro 2016) and study designs that intentionally combine qualitative evidence with representative surveys (Gofen, Meza, and Pérez-Chiqués 2022). Additionally, while the study's key independent variables about administrative form are not survey-based like the corruption measure, the indicators for the three moderating conditions are. ...
Article
Decentralization reform has both advantages and risks. Bringing service delivery “closer to the people” can improve information flows and strengthen accountability, but it may also leave systems vulnerable to elite capture and corruption by municipal government officials. While past research has acknowledged the possibility of corruption under decentralization, relatively little work has connected those risks to features of these reforms or specific local institutional arrangements. To explore the conditions that can help mitigate the risks of corruption under decentralization, we study the case of health sector reform in Honduras where municipal governments, associations, and NGOs each serve as intermediary-managing organizations under a common decentralized health service delivery model. We argue that three types of institutional arrangements reflecting local accountability practices serve as checks on the authority granted through decentralization and can help guard against corruption: external supervision, civil society engagement, and public participation. Empirically, we draw on data from more than 600 street-level bureaucrats, valuable but under-utilized informants about municipal corruption, across a matched sample of 65 municipalities with contrasting forms of administration. We find that reported corruption is highest under decentralization led by municipal governments, as compared to association- or NGO-led varieties. Both external supervision and civil society engagement help attenuate the positive association between decentralization and corruption, but public participation does not. Overall, this research highlights the importance of considering reform features and local conditions when designing policies to help manage risks and support effective social sector decentralization.
... In traditional questionnaires, sensitive or direct questions related to personal matters (e.g., salary issues, working atmosphere) might induce a significant bias in responses that cannot be prevented even by guaranteeing anonymity (Krause & Wahl, 2022; Rosenfeld et al., 2016). In contrast, indirect questions increase the interviewees' confidence and facilitate the collection of reliable information, allowing the researcher to accurately evaluate the phenomenon under investigation. ...
... This is because of the bias-variance tradeoff. A validation study shows that, compared to direct questioning, list experiments produce estimates closer to the true prevalence, albeit with wider confidence intervals (Rosenfeld, Imai, and Shapiro 2015). ...
Article
Social scientists use list experiments in surveys to estimate the prevalence of sensitive attitudes and behaviors in a population of interest. However, the cumulative evidence suggests that the list experiment estimator is underpowered to capture the extent of sensitivity bias in common applications. The literature suggests double list experiments (DLEs) as an alternative to improve along the bias-variance frontier. This variant of the research design brings the additional burden of justifying the list experiment identification assumptions in both lists, which raises concerns over the validity of DLE estimates. To overcome this difficulty, this paper outlines two statistical tests to detect strategic misreporting that follows from violations to the identification assumptions. I illustrate their implementation with data from a study on support toward anti-immigration organizations in California and explore their properties via simulation.
... Researchers such as Gonzalez-Ocantos et al. (2012) and Lyall et al. (2013) found that indirect questioning is the right method for avoiding such errors. Compared with direct questions, the indirect questioning method can be vital for eliminating bias (Rosenfeld et al., 2016; Van der Heijden et al., 2000). ...
Article
Estimating population parameters is important when drawing a sample from the population under study in the survey method. Many statisticians have introduced estimators that use auxiliary information to estimate population parameters for sensitive variables. In the current investigation, the researchers present a general parameter estimate for sensitive variables using randomized response models. The study used the survey method, with the sample gathered by simple random sampling without replacement (SRSWOR). Overall, it presents general ratio and exponential ratio estimators for the sensitive variable using a non-sensitive auxiliary variable (AV), based on an RRT. Expressions for the bias and MSE of these estimators were obtained as outcomes. Several empirical studies are replicated to demonstrate the performance of the proposed estimators for the sensitive variables in the population under study. This model will benefit other researchers and statisticians working in statistics or data collection, for instance on population censuses, who may take it forward to develop more advanced general parameter estimators and use it in further investigations.
... CM has been shown to perform better than list experiments (Jerke et al., 2022) and RRT (Höglinger and Jann, 2018). RRT has been shown to outperform list experiments in one study (Rosenfeld et al., 2016) but perform less well than list experiments in another study (Coutts and Jann, 2011). ...
Technical Report
Gambling is a large and growing industry. With that growth, there has also been growing concern about the potential harms that can arise from problem gambling. In late 2022, new legislation was introduced in Ireland to provide for more stringent regulation of the gambling industry and to establish an independent regulator, the Gambling Regulatory Authority of Ireland (GRAI). This review summarises and evaluates evidence from international research that is relevant to a number of policy questions. In doing so, it also identifies where evidence is deficient or lacking, to highlight some important and fruitful avenues for future research.
... List experiments are a method to quantify and mitigate such misclassification by allowing respondents to report only in an aggregate form without revealing their answers to specific items. Studies have shown that respondents are more likely to answer truthfully when asked via a list experiment than when asked directly [34]. The list experiment has been used to measure risky sexual behavior [21] and abortion [23]. ...
Article
Background: Self-report of sensitive sexual behaviors is often inaccurate and subject to social desirability bias. The list experiment is an alternative survey method to mitigate biases. The objective of this study was to estimate the rate of sexually transmitted infections (STIs) among older adults in urban Tanzania using the list experiment and to compare it with the estimate from direct questioning. Methods: The study was nested within the Dar es Salaam Urban Cohort Study, the Health and Demographic Surveillance System (HDSS) in Ukonga, Tanzania. Men and women aged ≥40 years were randomly assigned to receive a list of either four control items (i.e., control group), or four control items plus an additional item on having had a disease through sexual contact in the past 12 months (i.e., treatment group). We calculated the mean difference in the total number of items to which respondents responded "yes" in the treatment vs. control groups, and compared it to the proportion measured in direct questioning. Multivariate linear and non-linear regression models were also fitted. Results: A total of 2310 adults aged ≥40 years were enrolled in the study: 32% were male, and 48% were aged 40-49 years. The estimated prevalence of having sexually transmitted infections (STIs) in the past 12 months was 17.8% (95% Confidence Interval [CI] 12.3-23.3) in the list experiment, almost 10 times higher compared to 1.8% (95%CI 1.3-2.4) when directly asked (p<0.001). The prevalence was higher in men (27.0%; 95%CI 17.0-37.0) than in women (12.9%; 95%CI 6.4-19.4). The prevalence remained high after adjusting for age, multiple lifetime partners, and other sociodemographic factors in multivariate linear regression (15.6%; 95%CI 7.3-23.9). Conclusions: The study found a higher estimated prevalence of STIs among older adults in urban Tanzania using the list experiment than when directly asked. 
This highlights the need for screening of STIs including HIV to ensure effective prevention and treatment in older adults.
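The difference-in-means calculation described in the Methods above can be sketched as follows. This is a minimal illustration, not the study's code: the function name `list_experiment_estimate`, the simulated data, the 0.5 endorsement rate for control items, and the 0.18 prevalence are all assumptions made for the example, and the interval uses a plain normal approximation.

```python
import math
import random
from statistics import mean, variance


def list_experiment_estimate(control_counts, treatment_counts):
    """Difference in mean item counts (treatment minus control), with a
    95% normal-approximation confidence interval."""
    diff = mean(treatment_counts) - mean(control_counts)
    se = math.sqrt(
        variance(control_counts) / len(control_counts)
        + variance(treatment_counts) / len(treatment_counts)
    )
    return diff, (diff - 1.96 * se, diff + 1.96 * se)


def simulate_counts(n, with_sensitive_item, prevalence, rng):
    """Hypothetical respondents: four control items, each endorsed with
    probability 0.5, plus the sensitive item in the treatment group."""
    counts = []
    for _ in range(n):
        count = sum(rng.random() < 0.5 for _ in range(4))
        if with_sensitive_item and rng.random() < prevalence:
            count += 1
        counts.append(count)
    return counts


rng = random.Random(0)  # fixed seed so the sketch is reproducible
control = simulate_counts(1000, False, 0.18, rng)
treatment = simulate_counts(1000, True, 0.18, rng)
estimate, ci = list_experiment_estimate(control, treatment)
print(f"estimated prevalence: {estimate:.3f}, 95% CI: ({ci[0]:.3f}, {ci[1]:.3f})")
```

Because only the total count is reported, no individual answer to the sensitive item is revealed; the price is a noisier estimate than a direct question of the same sample size would give.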
... Despite their widespread use, the ability of RRTs to reduce bias and increase response accuracy when discussing sensitive topics is unclear (Ibbett et al., 2021;Lensvelt-Mulders, Hox, van der Heijden, & Maas, 2005;Umesh & Peterson, 1991). While some comparative studies suggest RRTs produce higher, and presumably more accurate, estimates than conventional methods such as direct questions (e.g., Carvalho, 2019;Cerri et al., 2017), evidence from the few validation studies that exist suggests RRTs often underestimate prevalence (Bova et al., 2018;Lensvelt-Mulders, Hox, van der Heijden, & Maas, 2005;Rosenfeld et al., 2016). Some researchers suggest that the method confuses respondents (Razafimanahaka et al., 2012), and uncertainty remains as to how instruction comprehension and topic sensitivity influence respondents' propensity to answer accurately, potentially introducing other forms of error. ...
Article
To develop more effective interventions, conservationists require robust information about the proportion of people who break conservation rules (such as those relating to protected species, or protected area legislation). Developed to obtain more accurate estimates of sensitive behaviors, including rule‐breaking, specialized questioning techniques such as Randomized Response Techniques (RRTs) are increasingly applied in conservation, but with mixed evidence of their effectiveness. We use a forced‐response RRT to estimate the prevalence of five rule‐breaking behaviors in communities living around the Ruaha–Rungwa ecosystem in Tanzania. Prevalence estimates obtained for all behaviors were negative or did not differ significantly from zero, suggesting the RRT did not work as expected and that respondents felt inadequately protected. To investigate, we carried out a second study to explore how topic sensitivity influenced respondents' propensity to follow RRT instructions. Results from this experimental study revealed respondents understood instructions well (~88% of responses were correct) but that propensity to follow RRT instructions was significantly influenced by the behavior asked about, and the type of answer they were required to provide. Our two studies highlight that even if RRTs are well understood by respondents, where topics are sensitive and respondents are wary of researchers, their use does not necessarily encourage more honest responding.
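To see how a forced-response design can return a negative prevalence estimate, consider the standard moment estimator sketched below. The function name and the die-based design (a roll of 1 forces "yes", a roll of 6 forces "no", any other roll calls for a truthful answer) are hypothetical choices for illustration; the abstract does not specify the randomizing device the authors used.

```python
def forced_response_estimate(n_yes, n_total, p_forced_yes, p_forced_no):
    """Moment estimator for a forced-response design.

    The expected share of "yes" answers is
        p_forced_yes + p_truth * prevalence,
    with p_truth = 1 - p_forced_yes - p_forced_no, so the prevalence is
    recovered by inverting that relation.
    """
    p_truth = 1.0 - p_forced_yes - p_forced_no
    if p_truth <= 0:
        raise ValueError("the design must leave room for truthful answers")
    observed_yes = n_yes / n_total
    return (observed_yes - p_forced_yes) / p_truth


# Hypothetical die design: 1 -> forced "yes", 6 -> forced "no", 2-5 -> truth.
compliant = forced_response_estimate(300, 1000, 1 / 6, 1 / 6)
evasive = forced_response_estimate(100, 1000, 1 / 6, 1 / 6)
print(f"compliant sample: {compliant:.2f}")
print(f"evasive sample:   {evasive:.2f}")
```

In the second call the observed "yes" share (10%) falls below the share the device alone should force (about 16.7%), so the estimate is negative. That pattern is what one would expect if some respondents answer "no" even when instructed to say "yes".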
... RR admits several variants, which we discuss later in the paper. Provided that people comply with the instructions of the surveyor, RR offers plausible deniability: a recorded response "Yes" may be due to the fact that the dice landed on 1 or 2. The empirical literature on survey design for sensitive questions has found that RR performs better than DE, at least in single shot, large scale surveys (Rosenfeld et al., 2016). ...
... Articles that studied abortion as one among several topics also studied "morally controversial" issues (Elías et al. 2017), the electoral implications of abortion (Glaeser, Ponzetto, and Shapiro 2005;Washington 2008), or contraception (Bailey 2010). Articles published in the three top political science journals that focused primarily on abortion evaluated judicial decision-making and legitimacy (Caldarone, Canes-Wrone, and Clark 2009;Zink, Spriggs, and Scott 2009) or public opinion (Kalla, Levine, and Broockman 2022;Rosenfeld, Imai, and Shapiro 2016). More commonly, abortion was one of several (or many) different issues analyzed, including government spending and provision of services, government help for African Americans, law enforcement, health care, education, free speech, Hatch Act restrictions, and the Clinton impeachment. ...
Article
Abortion is central to the American political landscape and a common pregnancy outcome, yet research on abortion has been siloed and marginalized in the social sciences: in an empirical analysis, we find only 22 articles published in this century in the top economics, political science, and sociology journals. This special issue aims to bring abortion research into a more generalist space, challenging what we term the “abortion research paradox” wherein abortion research is largely absent from prominent disciplinary social science journals but flourishes in interdisciplinary and specialized journals. After discussing the misconceptions that likely contribute to abortion research siloization and the implications of this siloization on abortion research as well as social science knowledge more generally, this essay introduces the articles in this special issue. Then, in a call for continued and expanded research on abortion, this essay closes by offering three guiding practices for abortion scholars, both those new to the topic and those already deeply familiar, in the hopes of building an ever-richer body of literature on abortion politics, policy, and law. The need for such a robust literature is especially acute following the United States Supreme Court’s June 2022 overturning of the constitutional right to abortion.
... For instance, while studies of how individuals react to CDR show little support for the MH/MD hypothesis, studies of how individuals think others react tend to show support. This divergence could be due to MH/MD being a less virtuous characteristic, leading people to underreport it to researchers when asked directly (Rosenfeld et al., 2016). We also observe that conclusions on how individuals assess arguments about MH/MD seem sensitive to the precise framing used in studies, and that there is reason to be cautious about the interpretation of responses. ...
Article
Carbon dioxide removal is rapidly becoming a key focus in climate research and politics. This is raising concerns of “moral hazard” or “mitigation deterrence,” that is, the risk that promises of and/or efforts to pursue carbon removal end up reducing or delaying near‐term mitigation efforts. Some, however, contest this risk, arguing that it is overstated or lacking evidence. In this review, we explore the reasons behind the disagreement in the literature. We unpack the different ways in which moral hazard/mitigation deterrence (MH/MD) is conceptualized and examine how these conceptualizations inform assessments of MH/MD risks. We find that MH/MD is a commonly recognized feature of modeled mitigation pathways but that conclusions as to the real‐world existence of MH/MD diverge on individualistic versus structural approaches to examining it. Individualistic approaches favor narrow conceptualizations of MH/MD, which tend to exclude the wider political‐economic contexts in which carbon removal emerges. This exclusion limits the value and relevance of such approaches. We argue for a broader understanding of what counts as evidence of delaying practices and propose a research agenda that complements theoretical accounts of MH/MD with empirical studies of the political‐economic structures that may drive mitigation deterrence dynamics. This article is categorized under: The Carbon Economy and Climate Mitigation > Benefits of Mitigation The Social Status of Climate Change Knowledge > Sociology/Anthropology of Climate Knowledge Policy and Governance > Multilevel and Transnational Climate Change Governance
... We must think carefully about what research designs are most suitable in response to the Kremlin's increasing criminalization of dissent (Frye et al. 2022). Popular indirect questioning techniques such as list experiments may help to increase the candor of survey responses (Rosenfeld, Imai, and Shapiro 2016), but other recommendations of techniques old and new also follow. ...
Article
Amid ongoing uncertainty, regular surveying in Russia continues to date and collaborations with Western academics have too. These developments offer some basis for cautious optimism. Yet they also raise critical questions about the practice of survey research in repressive environments. Are Russians less willing today to respond to surveys? Are they less willing to answer sensitive questions? How can we design research to elicit truthful responses and to know whether respondents are answering insincerely about sensitive opinions? This article lays out some of the existing evidence on these important questions. It also makes the argument that cross-fertilization with other fields can help to ensure a rigorous understanding of and response to changes in the environment for survey research in Russia.
... The fundamental effect of SDB is the masking of the true response, which poses a serious threat to the validity of the findings (Tourangeau et al. (2007) and Schill and Kirk (2017)). This remains a noticeably relatable phenomenon in almost every field of social research; for more on the pervasive nature of SDB, see also Krumpal (2013), Rosenfeld et al. (2015), Hussain et al. (2019) and Vesely and Klöckner (2020). Figure 1 aims to aid comprehension of the discussion documented above. ...
Chapter
This research primarily proposes a hybrid modeling strategy capable of accommodating socially masked responses when stigmatized behaviors are under study. Our suggested approach respects the fact that individuals' behaviors are embedded within their distinctive socio-cultural fabrics and that the "need for social approval" phenomenon is very likely to affect respondents' answers. Ignoring the effects of the varying cultural streams prevalent in society may result in uninterpretable and incoherent estimates of social psychology. Our methodology is, by nature, a post-hoc remedial measure. We deal with the situation where data have already been collected and an initial analysis reveals patterns pointing toward the existence of social desirability bias. We are then at a crossroads with two options: (i) redo the data collection exercise, or (ii) report results distorted by the bias. Our scheme allows investigators to work with the same data while taking into account the presence of desirability bias by incorporating a masking parameter in the model. The applicability of the proposed model is demonstrated by studying contraceptive behaviors and their driving factors in a multilingual, culturally diverse, and relatively rigid society, where social desirability bias is highly likely to be prevalent. We examine a nationally representative sample of 17,446 ever-married females aged 15-49 years from Pakistan. The information was assembled through the Pakistan Social and Living Standards Measurement (2013-14) survey, a data collection exercise launched by the Pakistan Bureau of Statistics. The effectiveness of the proposed technique is documented in comparison with existing modeling strategies, namely the generalized linear model and the multilevel generalized linear model. The gains from accommodating the extent of masking in the data are evidenced by the interpretability and increased stability of the resulting estimates.
Article
Explanations for ethnic voting have focused primarily on voters’ use of ethnicity as a heuristic for evaluating parties or candidates, or on the expressive benefits voting for coethnics may provide. This article describes and tests a largely overlooked explanation for ethnic voting resulting from group norms and social pressure. Employing a combination of experimental and observational data from Kenya—as well as observational data from three other African countries—it finds evidence that many voters have no intrinsic preference for coethnic candidates, but that their desire to conform to the norms of their ethnic community drives them to vote along ethnic lines. The results have important implications for our understanding of ethnic voting, as well as the conditions under which survey respondents provide truthful answers about group-related preferences.
Article
When using the customary direct questioning approach to collect sensitive information in a survey, some respondents may have no problem disclosing their true status or opinion, while others may be reluctant to reveal them. The problem of response bias is likely to arise as some responses tend to be more socially desirable than truthful. Hence, the indirect questioning approach can be used to guarantee privacy protection and reduce the influence of social desirability bias. This paper introduces a new technique which combines both the direct and indirect questioning approaches and allows Bayesian estimates of the prevalence of multisensitive attributes to be obtained by taking into account the estimated proportion of honest respondents under direct questioning. Our proposal stems from a real survey in Taiwan and is illustrated with two motivating examples concerning voting behaviours and sexual identity. The empirical analysis reinforces that the proposed two‐stage multilevel method is satisfactory in mitigating the effects of respondents' self‐protective behaviour, and produces results that are more reliable than those based on the traditional direct questioning approach and better than a previous version of the multilevel randomised response method that ignores the presence of cheating behaviours in the direct questioning stage.
Article
We study secure survey designs in organizational settings where fear of retaliation makes it hard to elicit truth. Theory predicts that (i) randomized-response techniques offer no improvement because they are strategically equivalent to direct elicitation, (ii) exogenously distorting survey responses (hard garbling) can improve information transmission, and (iii) the impact of survey design on reporting can be estimated in equilibrium. Laboratory experiments confirm that hard garbling outperforms direct elicitation but randomized response works better than expected. False accusations slightly but persistently bias treatment effect estimates. Additional experiments reveal that play converges to equilibrium if learning from others’ experience is possible. (JEL C83, C90, D83, D91)
Article
Survey researchers have long protected respondent privacy via de‐identification (removing names and other directly identifying information) before sharing data. Unfortunately, recent research demonstrates that these procedures fail to protect respondents from intentional re‐identification attacks, a problem that threatens to undermine vast survey enterprises in academia, government, and industry. This is especially a problem in political science because political beliefs are not merely the subject of our scholarship; they represent some of the most important information respondents want to keep private. We confirm the problem in practice by re‐identifying individuals from a survey about a controversial referendum declaring life beginning at conception. We build on the concept of “differential privacy” to offer new data‐sharing procedures with mathematical guarantees for protecting respondent privacy and statistical validity guarantees for social scientists analyzing differentially private data. The cost of these procedures is larger standard errors, which can be overcome with larger sample sizes.
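The trade-off named in the last sentence above, noisier estimates that larger samples can offset, can be illustrated with the textbook Laplace mechanism. This is a generic sketch and not the data-sharing procedure the authors propose: the function name `private_proportion`, the choice of epsilon, and the simulated 0/1 responses are assumptions, and the released value is deliberately left unclamped, so it can fall slightly outside [0, 1].

```python
import random


def private_proportion(responses, epsilon, rng):
    """Release the share of 1s in a list of 0/1 responses with Laplace noise.

    Changing one respondent's answer moves the share by at most 1/n, so
    Laplace noise with scale 1 / (n * epsilon) gives epsilon-differential
    privacy for this single release.
    """
    n = len(responses)
    true_share = sum(responses) / n
    scale = 1.0 / (n * epsilon)
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_share + noise


rng = random.Random(42)  # fixed seed so the sketch is reproducible
small_sample = [1] * 30 + [0] * 70        # n = 100, true share 0.3
large_sample = [1] * 3000 + [0] * 7000    # n = 10,000, true share 0.3
print(private_proportion(small_sample, 0.5, rng))
print(private_proportion(large_sample, 0.5, rng))
```

The noise scale is proportional to 1/n, so at a fixed epsilon the released share from the larger sample is typically much closer to 0.3 than the one from the smaller sample.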
Article
Can reputational threat among coworkers reduce bribery in organizations? I exploit within- and across-organizational variation in bribery to design and implement a field experiment in the maternity wards of five Moroccan public hospitals. I test whether threatening to reveal information about ward workers’ involvement in bribery to their coworkers dissuades them from taking bribes from patients. Healthcare workers cut back on taking bribes in higher-incidence maternity wards but not in lower-incidence wards. Qualitative data show that bribery’s baseline incidence sets the costs of revealing. Workers tolerate only so much bribery in their wards before they face the negative social consequences of belonging to a work group that takes bribes. They thus correct their behavior when it crosses a threshold. Moreover, ineffective applications of the field interventions betrayed welfare-diminishing effects. I furnish evidence for a novel kind of policy lever against workplace bribery and shed new light on the dynamics of bribery inside organizations. Funding: Funding from different programs at Stanford University—Stanford Interdisciplinary Graduate Fellowship, Abbasi Program for Islamic Studies Summer Research Grant, Graduate Research Opportunity Grant, Sociology Research Opportunity Grant, Stanford Center on Philanthropy and Civil Society Grant, Freeman Spogli Institute’s Mentored Global Research Fellowship, and Stanford Institute for Innovation and Entrepreneurship in Developing Economies Fellowship—are gratefully acknowledged. Supplemental Material: The online appendices are available at https://doi.org/10.1287/orsc.2021.15264 .
Article
Traditional validation processes for psychological surveys tend to focus on analyzing item responses instead of the cognitive processes that participants use to generate these responses. When screening for invalid responses, researchers typically focus on participants who manipulate their answers for personal gain or respond carelessly. In this paper, we introduce a new invalid response process, discordant responding, that arises when participants disagree with the use of the survey and discuss similarities and differences between this response style and protective responding. Results show that nearly all participants reflect on the intended uses of an assessment when responding to items and may decline to respond or modify their responses if they are not comfortable with the way the results will be used. Incidentally, we also find that participants may misread survey instructions if they are not interactive. We introduce a short screener to detect invalid responses, the discordant response identifiers (DRI), which provides researchers with a simple validity tool to use when validating surveys. Finally, we provide recommendations about how researchers may use these findings to design surveys that reduce this response manipulation in the first place.
Article
This article analyses how high‐level bureaucrats evaluate the leadership of technocrat and partisan cabinet ministers in different roles of policymaking. The argument is that bureaucrats perceive ministers with policy expertise to have a central role in policymaking, especially in policy‐directing tasks. Despite their essential contribution to coalition formation, ministers with political experience are negatively evaluated in all policymaking roles. The article presents evidence based on an endorsement experiment conducted with the high‐level bureaucracy in Brazil. The results show that ministers with policy experience receive positive evaluations from the bureaucracy in policy formulation and implementation roles but not to carry out political coordination activities with the presidency or the legislature. Ministers with a partisan profile receive negative evaluations in all tasks of the policy process. Exploring the mechanism, we show that the negative assessment of ministers with a partisan profile is maintained even when the profile of the bureaucrat is considered. These results show the negative attitudes of high‐level bureaucrats towards partisan ministers in contexts of substantial patronage and corruption and contribute to the debate on ministerial appointments and their implications for policymaking.
Article
The article is devoted to understanding the problem of sensitivity in survey research. It presents a retrospective analysis of the formation and development of the field of scientific knowledge that Western sociology in the 1990s referred to as “sensitive research”. A brief historical outline of the study of sensitive issues is given, with an emphasis on the most prominent schools in world sociology and the most renowned authors who have made a significant contribution to the study of this topic (representatives of the Chicago School, A. Kinsey, S. Warner, G.S. Becker, R. Lee, C. Renzetti, R. Tourangeau, T. Yang and others). The early and modern conceptualizations of sensitivity are critically analyzed, and the weaknesses and shortcomings of both expansive (J. Sieber and B. Stanley) and restrictive (N. Farberow) interpretations of this concept are shown. A multifactorial approach developed by R. Lee and C. Renzetti is considered as an alternative, one that takes into account the various types of threats that determine the sensitive nature of the questions asked and the answers received. The social nature of sensitivity is discussed. It is shown how the socio-cultural context and the specifics of respondents’ perception of questions influence the results of survey studies. The main consequences of using sensitive questions in sociological research are also analyzed. Three effects are singled out as most detrimental to the quality of empirical data: weakened cooperation on the part of respondents, an increase in the number of unanswered questions (non-answers), and the emergence of socially desirable (insincere) answers. The factors causing these effects are identified, and methods are proposed to help neutralize them. Conclusions are drawn about the socio-cultural conditionality of question sensitivity and its contextual and situational nature.
Article
We propose a simple, new technique to obtain truthful answers to sensitive, categorical questions. The Paired Response Technique (PRT) asks participants to merely report the sum of the answers to two paired questions, one baseline and one sensitive, with the answers to each separate question only known to the participants. The technique then statistically infers the prevalence of the sensitive characteristic and its potential drivers from the association of the baseline question with other questions in the survey. Monte Carlo simulations demonstrate the performance of the PRT under varying conditions. A representative survey (n = 4,649) in the Netherlands about legal and illegal purchases of prescription drugs to enhance sexual performance reveals that 17.4% of the target population has purchased medication to enhance sexual performance at least once. In contrast, in a control group surveyed with direct questioning, only 5.1% admit having done so. The great majority of these individuals opt to purchase illegally. Two further empirical applications, in the U.S. and in the U.K. respectively, show that the PRT reduces cognitive and affective costs of survey participation compared to a state-of-the-art Randomized Response Technique for categorical questions.
Article
How strongly embraced within the officer corps is the commitment to supporting and defending the Constitution and to the ethic of nonpartisanship? This article answers that question through a 2019/2020 survey of 1,470 service academy students, including with a list experiment. The results show that cadets engage in what we term “selective endorsement” of norms, whereby they endorse norms as long as they are not in tension with their partisan identities. In particular, the list experiment reveals that when provided an opportunity to obscure their preferences, many cadets supported following civilian orders, even those at odds with democratic traditions—and that partisan dynamics may play a role in determining how they respond. The article has important implications for scholarly research on norm robustness and socialization, as well as practical consequences for civil-military relations in light of ongoing challenges to democracy in the United States today.
Article
Coming out, or the disclosure of a minority identity, features prominently across disciplines, including several subfields of sociological research. In the context of sexuality, theoretical arguments offer competing predictions. Some studies propose that coming out is increasingly an unremarkable life transition as the stigma associated with non-heterosexualities attenuates, while others posit entrenched discrimination. Rather than testing these theories or providing incremental evidence in support of one position, we use 52 in-depth interviews with recently-out individuals to explain how identity disclosures in the present moment can validate plural possibilities. Our findings show that ambivalence is the core narrative which animates the contemporary coming out process. Respondents identify three interpretive frameworks that structure their experience of sexuality as at once incidental and central: generational differences, identity misrecognitions, and interfacing with institutions. We also detail a fourth theme, intersectionality, which shows the analytic limits of ambivalence in the coming out process. These patterns suggest more broadly that sexuality, like ethnicity, may provide symbolic resources—“distinguishing but not defining”—in the service of crafting a modern sexual self.
Chapter
Existing theories of election-related violence often assume that if elites instigate violence, they must benefit electorally from doing so. With a focus on Kenya, this book employs a wide array of data and empirical methods to demonstrate that - contrary to conventional wisdom - violence can be a costly strategy resulting in significant voter backlash. The book argues that politicians often fail to perceive these costs and thus employ violence as an electoral tactic even when its efficacy is doubtful. Election-related violence can therefore be explained not solely by the electoral benefits it provides, but by politicians' misperceptions about its effectiveness as an electoral tactic. The book also shows that violence in founding elections - the first elections held under a new multiparty regime - has long-lasting effects on politicians (mis)perceptions about its usefulness, explaining why some countries' elections suffer from recurrent bouts of violence while others do not.
Article
List experimentation is a common survey methodology that purports to reduce or eliminate social desirability bias. While some studies have assessed list experimentation’s effectiveness in achieving that goal, to our knowledge, this is the first ever experimental evaluation of interviewer effects on list experiment performance. We embedded a list experiment about immigration attitudes in an in-person survey administered to 718 white respondents. Randomly assigning Caucasian and Latinx interviewers, we find strong evidence that responses to the list experiment differed by interviewer ethnicity, thus failing to fully eliminate social desirability bias. A follow-up survey of 1,460 online respondents revealed similar difference-in-differences when merely priming the ethnic identities of survey researchers through pictures. The results of this study shed light on patterns of interpersonal communication about sensitive issues and how social context shapes the reporting of political attitudes, even when methodology specifically meant to mute sensitivity biases is employed.
Chapter
This paper develops a structural approach for modeling how respondents answer survey questions and uses it to estimate the proportion of respondents who are reticent in answering corruption questions, as well as the extent to which reticent behavior biases conventional estimates of corruption downwards. The context is a common two-step question, first inquiring whether a government official visited a business, and then asking about bribery if a visit was acknowledged. Reticence is a concern for both steps, since denying a visit side-steps the bribe question. This paper considers two alternative models of how reticence affects responses to two-step questions, with differing assumptions on how reticence affects the first question about visits. Maximum-likelihood estimates are obtained for seven countries using data on interactions with tax officials. Different models work best in different countries, but cross-country comparisons are still valid because both models use the same structural parameters. On average 40% of corruption questions are answered reticently, with much variation across countries. A statistic reflecting how much standard measures underestimate the proportion of all respondents who had a bribe interaction is developed. The downward bias in standard measures is highly statistically significant in all countries, varying from 12% in Nigeria to 90% in Turkey. The source of bias varies widely across countries, between denying a visit and denying a bribe after admitting a visit.
Article
The public often believes that men in political office are better at handling some issues or possess specific traits, compared to women. Do individuals reveal their true preferences on surveys that inquire about these political gender stereotypes? This article employs methods that allow researchers to examine true attitudes without pressuring individuals to explicitly reveal sensitive preferences. I use three experiments: a list experiment, a new group-count sensitive measure, and a question-wording experiment employed on the 2016 CES. I find little evidence of reluctance to share true attitudes about gender stereotypes across any of the measures. The results presented help confirm the importance of gender stereotypes in shaping political preferences in American politics today and undergird evidence in prior scholarship on stereotypes.
Preprint
Full-text available
Efforts to develop quantitative measures of support for political violence and related concepts have been increasing in the past. These measures are often treated as roughly interchangeable although, to date, it is unclear whether they are indeed comparable. Therefore, in the current study, we aimed to investigate whether and to what extent measures of political violence can be used interchangeably. We conducted an online survey and collected participants' responses on two direct measures of attitudes towards political violence, two indirect measures, and one behavioural measure. Results revealed that direct measures in the form of vignette approaches were able to differentiate between different kinds of political violence, whereas broader direct measures were fuzzier in their outcomes. We further found that indirect and behavioural measures for political violence were difficult to operationalise. The endorsement experiment as an indirect measure appeared most promising in this regard, although it did not correspond perfectly to the results of the direct measures. By disassembling measurements of political violence, we may contribute to improving quantitative research on radicalism, radicalization, and extremism.
Article
Full-text available
Responses to phone surveys tend to exhibit higher rates of social desirability bias and extreme responses when compared to face-to-face surveys. Yet, studies of mode effects typically compare either representative studies that implausibly assume comparability or experimental studies that rely on convenience samples. Our study compares two national probability samples but uses matching to address comparability. We study Costa Rica, a middle-income democracy, to see whether the conventional wisdom drawn from Western Europe and North America extends to the Global South. We analyze two nationally representative surveys, one fielded by phone and one face-to-face, allowing us to compare identically worded items we placed on both surveys. We find that phone respondents exhibited more socially desirable responding and were more likely to choose negative endpoints on scalar items. This suggests that survey researchers and practitioners should carefully assess the tradeoffs in shifting modes or employing mixed modes.
Article
Although political science increasingly investigates emotions as variables, it often ignores emotions’ larger significance due to their inherence in research with human subjects. Integrating emotions into conversations on methods and ethics, I build on the term “ethnographic sensibility” to conceptualize an “emotional sensibility” that seeks to glean the emotional experiences of people who participate in research. Methodologically, emotional sensibility sharpens attention to how participants’ emotions are data, influence other data, and affect future data collection. Ethically, it supplements Institutional Review Boards’ rationalist emphasis on information and cognitive capacity with appreciation for how emotions infuse consent, risk, and benefit. It thereby encourages thinking not only about emotional harm but also about emotions apart from harm and about emotional harms apart from trauma and vulnerability. I operationalize emotional sensibility by tracking four dimensions of research that affect participants’ emotions: the content of research, the context in which research occurs, researchers’ positionality, and researchers’ conduct.
Article
Following its February 2022 invasion of Ukraine, the Russian government sharply broadened what actions were illegal and raised the level of punishment. Many more topics of interest to survey researchers became politically sensitive. Questions about these topics may generate high levels of misleading responses and question-specific (item) non-responses, both of which introduce biases that undermine inference. We use survey data from 2015 and 2018 in Russia and neighboring countries to illustrate how these two problems were already issues prior to the invasion, especially for questions that invoked potential punishment by the state. In a climate of heightened state punishment, it becomes even more important to address misresponse and item non-response when interpreting survey data. We argue that, in addition to employing list experiments regularly and taking advantage of recent innovations in their design, scholars must develop ways to reduce item non-response and model how it biases estimates of interest.
Article
Full-text available
This article assesses the validity of responses to sensitive questions using four different methods. In an experimental setting, the authors compared a computer-assisted self-interview (CASI), face-to-face direct questioning, and two different varieties of randomized response. All respondents interviewed had been identified as having committed welfare and unemployment benefit fraud. The interviewers did not know that respondents had been caught for fraud, and the respondents did not know that the researchers had this information. The results are evaluated by comparing the percentage of false negatives. The authors also looked for variables that might explain why some respondents admit fraud and others do not. The proportions of respondents admitting fraud are relatively low, between 19 percent and 49 percent. The two randomized response conditions were superior in eliciting admissions of fraud. A number of background variables, notably gender, age, still receiving benefit, and duration and perception of fraud, are related to admitting fraud. Although the randomized response conditions performed much better than face-to-face direct questioning and CASI, the percentage of respondents admitting fraud is only around 50 percent. Some possible reasons for this are discussed.
Article
Full-text available
The list experiment, also known as the item count technique, is becoming increasingly popular as a survey methodology for eliciting truthful responses to sensitive questions. Recently, multivariate regression techniques have been developed to predict the unobserved response to sensitive questions using respondent characteristics. Nevertheless, no method exists for using this predicted response as an explanatory variable in another regression model. We address this gap by first improving the performance of a naive two-step estimator. Despite its simplicity, this improved two-step estimator can only be applied to linear models and is statistically inefficient. We therefore develop a maximum likelihood estimator that is fully efficient and applicable to a wide range of models. We use a simulation study to evaluate the empirical performance of the proposed methods. We also apply them to the Mexico 2012 Panel Study and examine whether vote-buying is associated with increased turnout and candidate approval. The proposed methods are implemented in open-source software.
Article
Full-text available
This article is an empirical contribution to the evaluation of the randomized response technique (RRT), a prominent procedure to elicit more valid responses to sensitive questions in surveys. Based on individual validation data, we focus on two questions: First, does the RRT lead to higher prevalence estimates of sensitive behavior than direct questioning (DQ)? Second, are there differences in the effects of determinants of misreporting according to question mode? The data come from 552 face-to-face interviews with subjects who had been convicted by a court for minor criminal offences in a metropolitan area in Germany. For the first question, the answer is negative. For the second, it is positive, that is, effects of individual and situational determinants of misreporting differ between the two question modes. The effect of need for social approval, for example, tends to be stronger in RRT than in DQ mode. Interviewer experience turns out to be positively related to answer validity in DQ and negatively in RRT mode. Our findings support a skeptical position toward RRT, shed new light on long-standing debates within survey methodology, and stimulate theoretical reasoning about response behavior in surveys.
Article
Full-text available
This paper discusses the validity of self-report data. It appears that self-report data are not equally valid among all ethnic groups. Rather large differences are apparent in the tendency of boys with official police contacts to admit delinquent activities. Youngsters from Morocco and Turkey were much more reticent about admitting delinquent activities than those born in The Netherlands or coming from Surinam. These differences in willingness to admit delinquent behaviour are related to social control variables, the number of police contacts, and knowledge of the Dutch language. A problem for etiological research is reported: variables which are considered to cause delinquency are also related to the tendency to admit involvement in criminality. Overall, arrest data probably provide the best indicator for comparing criminal involvement between ethnic groups. © 1989 The Institute for the Study and Treatment of Delinquency.
Article
Full-text available
Surveys usually yield rates of voting in elections that are higher than official turnout figures, a phenomenon often attributed to intentional misrepresentation by respondents who did not vote and would be embarrassed to admit that. The experiments reported here tested the social desirability response bias hypothesis directly by implementing a technique that allowed respondents to report secretly whether they voted: the “item count technique.” The item count technique significantly reduced turnout reports in a national telephone survey relative to direct self-reports, suggesting that social desirability response bias influenced direct self-reports in that survey. But in eight national surveys of American adults conducted via the Internet, the item count technique did not significantly reduce turnout reports. This mode difference is consistent with other evidence that the Internet survey mode may be less susceptible to social desirability response bias because of self-administration.
Article
Full-text available
Surveys usually yield reported rates of voting in elections that are higher than official turnout figures, a phenomenon often attributed to intentional misrepresentation by respondents who did not vote and would be embarrassed to admit that. The experiments reported here tested a procedure for reducing social desirability response bias by allowing respondents to report secretly whether they voted: the “randomized response technique.” In a national telephone survey of a sample of American adults and eight national surveys of American adults conducted via the Internet, respondents were either unable or unwilling to implement the randomized response technique properly, raising questions about whether this technique has ever worked properly to achieve its goals.
Article
Full-text available
We explored the limitations of self-reports as substitutes for observation of deviant behavior. Results of a study conducted in The Netherlands indicated negligible correspondence between respondents' self-reports of tax evasion and officially documented behavior. Nonsignificant correlations were obtained despite the fact that all government claims against the respondents had been settled, unprotested, before this study began and despite the respondents' awareness that the accuracy of their self-reports could be checked against their tax records. In addition, the results suggest that different explanatory variables may be correlated with each type of behavioral measure. In this instance, attitude toward the act (A act) measures and subjective norm measures exhibited significant correlations with the self-report data but not with officially documented behavior, and measures of more broadly focused personal dispositions predicted actual behavior but not self-reports. Such outcomes suggest that the explanatory power of the theory of reasoned action may not extend to the domain of socially proscribed behaviors where self-presentation concerns are likely to prompt both misrepresentations of past behavior and reports of attitudes and perceived norms consistent with those misrepresentations.
Article
Full-text available
In Exp I, 183 undergraduates read a persuasive message from a likable or unlikable communicator who presented 6 or 2 arguments on 1 of 2 topics. High involvement (HI) Ss anticipated discussing the message topic at a future experimental session, whereas low-involvement (LI) Ss anticipated discussing a different topic. For HI Ss, opinion change was significantly greater given 6 arguments but was unaffected by communicator likability. For LI Ss, opinion change was significantly greater given a likable communicator but was unaffected by the argument's manipulation. In Exp II with 80 similar Ss, HI Ss showed slightly greater opinion change when exposed to 5 arguments from an unlikable (vs 1 argument from a likable) communicator, whereas LI Ss exhibited significantly greater persuasion in response to 1 argument from a likable (vs 5 arguments from an unlikable) communicator. Findings support the idea that HI leads message recipients to employ a systematic information processing strategy in which message-based cognitions mediate persuasion, whereas LI leads recipients to use a heuristic processing strategy in which simple decision rules mediate persuasion. Support was also obtained for the hypothesis that content- vs source-mediated opinion change would result in greater persistence.
Article
Full-text available
The univariate and multivariate logistic regression model is discussed where response variables are subject to randomized response (RR). RR is an interview technique that can be used when sensitive questions have to be asked and respondents are reluctant to answer directly. RR variables may be described as misclassified categorical variables where conditional misclassification probabilities are known. The univariate model is revisited and is presented as a generalized linear model. Standard software can be easily adjusted to take into account the RR design. The multivariate model does not appear to have been considered elsewhere in an RR setting; it is shown how a Fisher scoring algorithm can be used to take the RR aspect into account. The approach is illustrated by analyzing RR data taken from a study in regulatory non-compliance regarding unemployment benefit.
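The idea of treating a randomized response variable as a misclassified binary outcome with known misclassification probabilities can be sketched as a log-likelihood. This is only an illustration under stated assumptions: the function and parameter names (`rr_log_likelihood`, `p_yes_if_yes`, `p_yes_if_no`) are hypothetical, and the code is not the authors' implementation or their Fisher scoring algorithm.

```python
from math import exp, log


def expit(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + exp(-z))
    ez = exp(z)
    return ez / (1.0 + ez)


def rr_log_likelihood(beta, rows, observed, p_yes_if_yes, p_yes_if_no):
    """Log-likelihood of a logistic model for a randomized response outcome.

    beta          coefficients of the logistic model for the TRUE status
    rows          covariate vectors, one per respondent
    observed      observed (randomized) answers, 1 = yes, 0 = no
    p_yes_if_yes  known P(observed yes | true yes), fixed by the RR design
    p_yes_if_no   known P(observed yes | true no), fixed by the RR design
    """
    total = 0.0
    for x, y in zip(rows, observed):
        eta = sum(b * v for b, v in zip(beta, x))
        pi_true = expit(eta)  # probability of a true "yes"
        # Mix the true probability through the known misclassification rates.
        p_obs_yes = p_yes_if_yes * pi_true + p_yes_if_no * (1.0 - pi_true)
        total += log(p_obs_yes) if y == 1 else log(1.0 - p_obs_yes)
    return total
```

With `p_yes_if_yes = 1` and `p_yes_if_no = 0` the expression reduces to the ordinary logistic log-likelihood, which is a convenient sanity check on any fitting routine built around it.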
Article
Full-text available
This article discusses two meta-analyses on randomized response technique (RRT) studies, the first on 6 individual validation studies and the second on 32 comparative studies. The meta-analyses focus on the performance of RRTs compared to conventional question-and-answer methods. The authors use the percentage of incorrect answers as effect size for the individual validation studies and the standardized difference score (d-probit) as effect size for the comparative studies. Results indicate that compared to other methods, randomized response designs result in more valid data. For the individual validation studies, the mean percentage of incorrect answers for the RRT condition is .38; for the other conditions, it is .49. The more sensitive the topic under investigation, the higher the validity of RRT results. However, both meta-analyses have unexplained residual variances across studies, which indicates that RRTs are not completely under the control of the researcher.
Article
Full-text available
Undergraduates expressed their attitudes about a product after being exposed to a magazine ad under conditions of either high or low product involvement. The ad contained either strong or weak arguments for the product and featured either prominent sports celebrities or average citizens as endorsers. The manipulation of argument quality had a greater impact on attitudes under high than low involvement, but the manipulation of product endorser had a greater impact under low than high involvement. These results are consistent with the view that there are two relatively distinct routes to persuasion.
Article
Full-text available
Psychologists have worried about the distortions introduced into standardized personality measures by social desirability bias. Survey researchers have had similar concerns about the accuracy of survey reports about such topics as illicit drug use, abortion, and sexual behavior. The article reviews the research done by survey methodologists on reporting errors in surveys on sensitive topics, noting parallels and differences from the psychological literature on social desirability. The findings from the survey studies suggest that misreporting about sensitive topics is quite common and that it is largely situational. The extent of misreporting depends on whether the respondent has anything embarrassing to report and on design features of the survey. The survey evidence also indicates that misreporting on sensitive topics is a more or less motivated process in which respondents edit the information they report to avoid embarrassing themselves in the presence of an interviewer or to avoid repercussions from third parties.
Article
Policy debates on strategies to end extremist violence frequently cite poverty as a root cause of support for the perpetrating groups. There is little evidence to support this contention, particularly in the Pakistani case. Pakistan's urban poor are more exposed to the negative externalities of militant violence and may in fact be less supportive of the groups. To test these hypotheses we conducted a 6,000-person, nationally representative survey of Pakistanis that measured affect toward four militant organizations. By applying a novel measurement strategy, we mitigate the item nonresponse and social desirability biases that plagued previous studies due to the sensitive nature of militancy. Contrary to expectations, poor Pakistanis dislike militants more than middle-class citizens. This dislike is strongest among the urban poor, particularly those in violent districts, suggesting that exposure to terrorist attacks reduces support for militants. Long-standing arguments tying support for violent organizations to income may require substantial revision.
Article
Challenging conventional wisdom, previous research in South Asia and the Middle East has shown that poverty and exposure to violence are negatively correlated with support for militant organizations. Existing studies, however, provide evidence consistent with two potential mechanisms underlying these relationships: (1) the direct effects of poverty and violence on attitudes toward militant groups and (2) the psychological effects of perceptions of poverty and violence on attitudes. Isolating whether the psychological mechanism is an important one is critical for building theories of mass responses to political violence. We conducted a series of original, large-scale survey experiments in Pakistan (n=16,279) in which we randomly manipulated perceptions of both poverty and violence before measuring support for militant organizations. We find evidence that psychological perceptions do in part explain why the poor seem to be less supportive of militant political groups.
Article
About a half century ago, in 1965, Warner proposed the randomized response method as a survey technique to reduce potential bias due to nonresponse and social desirability when asking questions about sensitive behaviors and beliefs. This method asks respondents to use a randomization device, such as a coin flip, whose outcome is unobserved by the interviewer. By introducing random noise, the method conceals individual responses and protects respondent privacy. While numerous methodological advances have been made, we find surprisingly few applications of this promising survey technique. In this article, we address this gap by (1) reviewing standard designs available to applied researchers, (2) developing various multivariate regression techniques for substantive analyses, (3) proposing power analyses to help improve research designs, (4) presenting new robust designs that are based on less stringent assumptions than those of the standard designs, and (5) making all described methods available through open-source software. We illustrate some of these methods with an original survey about militant groups in Nigeria.
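How random noise can conceal individual answers while still allowing a prevalence estimate can be illustrated with a small simulation. The sketch below assumes a forced-response design, in which a private randomization device forces some respondents to say "yes", forces others to say "no", and lets the rest answer truthfully. The abstract does not say which design is meant, so the design choice, the function names, and the probabilities are all assumptions made for illustration.

```python
import random


def simulate_forced_response(true_prevalence, n, p_forced_yes, p_forced_no, seed=0):
    """Simulate n answers under a forced-response design (illustrative only)."""
    rng = random.Random(seed)
    answers = []
    for _ in range(n):
        holds_trait = rng.random() < true_prevalence
        u = rng.random()  # outcome of the private randomization device
        if u < p_forced_yes:
            answers.append(1)  # forced "yes", regardless of the truth
        elif u < p_forced_yes + p_forced_no:
            answers.append(0)  # forced "no", regardless of the truth
        else:
            answers.append(1 if holds_trait else 0)  # truthful answer
    return answers


def estimate_prevalence(answers, p_forced_yes, p_forced_no):
    """Invert P(yes) = p_forced_yes + p_truth * prevalence."""
    p_truth = 1.0 - p_forced_yes - p_forced_no
    if p_truth <= 0:
        raise ValueError("the design must leave some probability of a truthful answer")
    yes_rate = sum(answers) / len(answers)
    return (yes_rate - p_forced_yes) / p_truth
```

For example, `estimate_prevalence(simulate_forced_response(0.3, 20000, 1/6, 1/6), 1/6, 1/6)` should land close to 0.3, even though no single "yes" reveals whether that respondent holds the trait. The price is extra variance, because only the truthful share of answers carries information.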
Article
Randomized response is a survey technique for reducing response bias arising from respondent concern over revealing sensitive information. There has been some question whether bias reduction earned through the randomized response approach is sufficient to compensate for its inefficiency. By comparing self-reported arrests for two interview conditions (randomized response and direct question) with corresponding true scores appearing in police arrest files, a field-validation of a quantitative randomized response model was attempted. Overall, randomized response outperformed the more traditional direct-question method. Not only was there substantial reduction in mean response error, but the response error operative in the randomized response condition appeared to be random rather than systematic. A mean squared error comparison of the two conditions appears to assuage the concern over its relative inefficiency.
Article
Qualitative, quantitative, and ratio estimate randomized response models were tested in comparison with a conventional interview technique in the measurement of a sensitive issue with known true values. Results show that the randomized response model is successful in minimizing measurement error and provides more accurate estimates of sensitive behavior than conventional interview techniques.
Article
This validation study examined the joint effects of question threat and method of administration on response distortion using four interviewing techniques. The level of threat was varied by asking questions about library card ownership, voting, bankruptcy involvement, and having been charged with drunken driving. The results indicated that response distortion increased sharply as threat increased. None of the data methods was clearly superior to all other methods for all types of threatening questions. Randomized response gave the lowest distortion on questions about socially undesirable acts, but even with this procedure there was still a 35 percent understatement of drunken driving.
Article
Qualitative studies of vote buying find the practice to be common in many Latin American countries, but quantitative studies using surveys find little evidence of vote buying. Social desirability bias can account for this discrepancy. We employ a survey-based list experiment to minimize the problem. After the 2008 Nicaraguan municipal elections, we asked about vote-buying behavior by campaigns using a list experiment and the questions traditionally used by studies of vote buying on a nationally representative survey. Our list experiment estimated that 24% of registered voters in Nicaragua were offered a gift or service in exchange for votes, whereas only 2% reported the behavior when asked directly. This detected social desirability bias is nonrandom and analysis based on traditional obtrusive measures of vote buying is unreliable. We also provide systematic evidence that shows the importance of monitoring strategies by parties in determining who is targeted for vote buying. Clientelistic electoral linkages are characterized by a transaction of political favors in which politicians offer immediate material incentives to citizens or groups in exchange for electoral support. Vote buying, which is a more particularized form of clientelism involving the exchange of goods for votes at the individual level (Stokes 2007), has generated numerous ethnographies and surveys to measure its incidence and test related hypotheses. While qualitative research routinely finds vote buying to be pervasive in the developing world (e.g., Auyero 2001), individual-level surveys often uncover low levels of such exchanges (e.g., Transparency ...)
Article
The validity of empirical research often relies upon the accuracy of self-reported behavior and beliefs. Yet eliciting truthful answers in surveys is challenging, especially when studying sensitive issues such as racial prejudice, corruption, and support for militant groups. List experiments have attracted much attention recently as a potential solution to this measurement problem. Many researchers, however, have used a simple difference-in-means estimator, which prevents the efficient examination of multivariate relationships between respondents' characteristics and their responses to sensitive items. Moreover, no systematic means exists to investigate the role of underlying assumptions. We fill these gaps by developing a set of new statistical methods for list experiments. We identify the commonly invoked assumptions, propose new multivariate regression estimators, and develop methods to detect and adjust for potential violations of key assumptions. For empirical illustration, we analyze list experiments concerning racial prejudice. Open-source software is made available to implement the proposed methodology.
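The simple difference-in-means estimator mentioned in this abstract can be sketched in a few lines. In a list experiment the control group reports how many of J non-sensitive items apply to them, the treatment group sees the same list plus the sensitive item, and the difference in mean counts estimates the prevalence of the sensitive item. The function name and the toy counts below are hypothetical, and the sketch does not reproduce the multivariate estimators the article proposes.

```python
from math import sqrt
from statistics import mean, variance


def list_experiment_estimate(control_counts, treatment_counts):
    """Difference-in-means estimate of the sensitive item's prevalence.

    Returns the point estimate and a standard error that assumes the two
    groups are independent random samples.
    """
    estimate = mean(treatment_counts) - mean(control_counts)
    std_error = sqrt(
        variance(treatment_counts) / len(treatment_counts)
        + variance(control_counts) / len(control_counts)
    )
    return estimate, std_error


# Toy data: item counts reported by six respondents in each group.
control = [1, 2, 2, 3, 1, 2]      # list of J non-sensitive items
treatment = [2, 2, 3, 3, 1, 3]    # same list plus the sensitive item
estimate, std_error = list_experiment_estimate(control, treatment)
```

Because the estimator uses only group means, it says nothing about which respondent characteristics predict the sensitive answer, which is the limitation the abstract points to.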
Article
How are civilian attitudes toward combatants affected by wartime victimization? Are these effects conditional on which combatant inflicted the harm? We investigate the determinants of wartime civilian attitudes towards combatants using a survey experiment across 204 villages in five Pashtun-dominated provinces of Afghanistan—the heart of the Taliban insurgency. We use endorsement experiments to indirectly elicit truthful answers to sensitive questions about support for different combatants. We demonstrate that civilian attitudes are asymmetric in nature. Harm inflicted by the International Security Assistance Force (ISAF) is met with reduced support for ISAF and increased support for the Taliban, but Taliban-inflicted harm does not translate into greater ISAF support. We combine a multistage sampling design with hierarchical modeling to estimate ISAF and Taliban support at the individual, village, and district levels, permitting a more fine-grained analysis of wartime attitudes than previously possible.
Article
List and endorsement experiments are becoming increasingly popular among social scientists as indirect survey techniques for sensitive questions. When studying issues such as racial prejudice and support for militant groups, these survey methodologies may improve the validity of measurements by reducing nonresponse and social desirability biases. We develop a statistical test and multivariate regression models for comparing and combining the results from list and endorsement experiments. We demonstrate that when carefully designed and analyzed, the two survey experiments can produce substantively similar empirical findings. Such agreement is shown to be possible even when these experiments are applied to one of the most challenging research environments: contemporary Afghanistan. We find that both experiments uncover similar patterns of support for the International Security Assistance Force (ISAF) among Pashtun respondents. Our findings suggest that multiple measurement strategies can enhance the credibility of empirical conclusions. Open-source software is available for implementing the proposed methods.
Article
A general linear randomized response model is established with estimates and variances obtained through analogy with familiar linear regression models. All existing randomized response procedures are shown to be special cases of this more general model. Some competing procedures are suggested by the applicability of the model for multivariate mixes of randomized and non-randomized response using either discrete or continuous random variables. Some additional applications are suggested by the applicability of the model for situations where the data have already been collected by some agency but where there are disclosure restrictions.
Article
We propose a new prior distribution for classical (non-hierarchical) logistic regression models, constructed by first scaling all nonbinary variables to have mean 0 and standard deviation 0.5, and then placing independent Student-t prior distributions on the coefficients. As a default choice, we recommend the Cauchy distribution with center 0 and scale 2.5, which in the simplest setting is a longer-tailed version of the distribution attained by assuming one-half additional success and one-half additional failure in a logistic regression. Cross-validation on a corpus of datasets shows the Cauchy class of prior distributions to outperform existing implementations of Gaussian and Laplace priors. We recommend this prior distribution as a default choice for routine applied use. It has the advantage of always giving answers, even when there is complete separation in logistic regression (a common problem, even when the sample size is large and the number of predictors is small) and also automatically applying more shrinkage to higher-order interactions. This can be useful in routine data analysis as well as in automated procedures such as chained equations for missing-data imputation. We implement a procedure to fit generalized linear models in R with the Student-t prior distribution by incorporating an approximate EM algorithm into the usual iteratively weighted least squares. We illustrate with several examples, including a series of logistic regressions predicting voting preferences, a small bioassay experiment, and an imputation model for a public health data set.
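The two ingredients named in this abstract, rescaling nonbinary inputs to mean 0 and standard deviation 0.5 and placing independent Cauchy(0, 2.5) priors on the coefficients, can be written out as a minimal sketch. The function names are hypothetical, the use of the population standard deviation is an assumption, and binary inputs are simply left untouched because the abstract only describes the nonbinary case. The approximate EM fitting procedure is not reproduced here.

```python
from math import log, pi
from statistics import mean, pstdev


def rescale_input(values):
    """Rescale a nonbinary input to mean 0 and standard deviation 0.5."""
    distinct = set(values)
    if len(distinct) < 2:
        raise ValueError("a constant input cannot be rescaled")
    if len(distinct) == 2:
        return list(values)  # binary input: left as-is in this sketch
    centre = mean(values)
    spread = pstdev(values)  # population SD (an assumption of this sketch)
    return [0.5 * (v - centre) / spread for v in values]


def cauchy_log_density(coef, center=0.0, scale=2.5):
    """Log density of a Cauchy(center, scale) distribution at coef."""
    z = (coef - center) / scale
    return -log(pi * scale) - log(1.0 + z * z)


def log_prior(coefs, center=0.0, scale=2.5):
    """Independent Cauchy priors: the log densities add up."""
    return sum(cauchy_log_density(c, center, scale) for c in coefs)
```

Adding `log_prior(beta)` to a logistic log-likelihood gives a penalized objective whose maximum stays finite even under complete separation, which is the "always giving answers" property the abstract mentions.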
Article
The authors compare four methods of collecting information on abortion through survey research to measure the levels of induced abortion in Mexico: face-to-face interview (FTF), audio computer-assisted self-interview (ACASI), self-administered questionnaire (SAQ), and a random-response technique (RRT). They tested all methods in three samples: (1) hospital patients in Mexico City, (2) rural women in Chiapas, and (3) women randomly chosen as part of a house-to-house survey in Mexico City. In every sample, RRT yielded the highest rate of attempted induced abortion (21.7, 36.1, and 17.9 percent in the hospital, rural, and household samples, respectively), followed by the SAQ (19.3, 10.1, and 10.8 percent, respectively). The ACASI and FTF interviews yielded fewer reported abortion attempts. The RRT seems the most promising methodology to measure the levels of induced abortion. With SAQ, detailed information was obtained, and the reported frequency rates were slightly lower than the RRT rates in urban areas.
Article
This study presents a survey-based method for conducting inference into the determinants of sensitive political behavior. The approach combines two well-established literatures in statistical methods in the social sciences: the randomized response (RR) methodology utilized to reduce evasive answer bias and the generalized propensity score methodology utilized to draw inferences about causal effects in observational studies. The approach permits one to estimate the causal impact of a multivalued predictor variable of interest on a given sensitive behavior in the face of unknown interaction effects between the predictor and the confounders as well as nonlinearities in the relationship between the confounders and the sensitive behavior. Simulation results point to the superior performance of the RR relative to direct survey questioning using this method for samples of moderate to large size. The utility of the approach is illustrated through an application to corruption in the public bureaucracy in three countries in South America.
Article
Message recipients' recall of attitude-relevant beliefs and experiences was expected to affect message processing such that high-retrieval recipients base their opinions relatively more on an analysis of message validity, whereas recipients who could recall few beliefs and experiences base their opinions on noncontent features such as source cues. Indeed, high-retrieval recipients were unaffected by the likability of the message source, they demonstrated relatively good recall of the message position and arguments, and cognitive response data indicated that persuasion was enhanced by positive rather than negative reactions to message content. In contrast, low-retrieval recipients were more persuaded by likable and by expert sources than unlikable and nonexpert ones. Further, these recipients showed relatively poor recall of the message position and arguments, and cognitive response data suggested that persuasion was enhanced by positive rather than negative reactions to the communicator.
Article
HIV/AIDS is a disease whose only known prevention is behavioral. Risky sex is one of the ways in which people become infected with HIV, as well as other STDs. Estimating the base rates of risky sex and risky sex after drinking proves difficult. This study uses the unmatched‐count technique (UCT) to estimate base rates for sexual risk behaviors and sexual risk behaviors after drinking and compares the findings with those estimates found using conventional methods. UCT does not require the participant to directly answer sensitive questions, and, thus, may provide more accurate reporting than other methods. In a population of college students, the UCT revealed higher estimates of base rates for having had sex, having had sex without a condom, and having had sex without a condom after drinking than an anonymous self‐report survey. These higher estimates provide a better feel for the level of these risk behaviors, may help understand the relationship between alcohol and risky sex, and point to the need to target more interventions for condom use and condom use in the presence of drinking among college students.
Article
An abundance of survey research conducted over the past two decades has portrayed a “new South” in which the region's white residents now resemble the remainder of the country in their racial attitudes. No longer is the South the bastion of racial prejudice. Using a new and relatively unobtrusive measure of racial attitudes designed to overcome possible social desirability effects, our study finds racial prejudice to be still high in the South and markedly higher in the South than the non-South. Preliminary evidence also indicates that this prejudice is concentrated among white southern men. Comparison of these results with responses to traditional survey questions suggests that social desirability contaminates the latter. This finding helps to explain why the “new South” thesis has gained currency.
Article
An experimental CATI-survey (N=2041), asking sensitive questions about xenophobia and anti-Semitism in Germany, was conducted to compare the randomized response technique (RRT) and the direct questioning technique. Unlike the vast majority of RRT surveys measuring the prevalence of socially undesirable behaviors, only a few studies have explored the effectiveness of the RRT with respect to the disclosure of socially undesirable opinions. Results suggest that the RRT is an effective method for eliciting more socially undesirable opinions and yielding more valid prevalence estimates of xenophobia and anti-Semitism than direct questioning ('more-is-better' assumption). Furthermore, the results indicate that with increasing topic sensitivity, the benefits of using the RRT also increase. Finally, adapted logistic regression analyses show that several covariates such as education and generalized trust are related to the likelihood of being prejudiced towards foreigners and Jews.
Article
Many areas of personnel research are “sensitive.” We provide an empirical assessment of the unmatched count technique (UCT) to determine the base rate for a number of proscribed behaviors for professional auctioneers. To our knowledge, this is the first empirical application of the UCT in organizational studies. Advantages of the UCT are discussed, including: (a) a more accurate estimate of the base rates for sensitive behavior, (b) absolute anonymity to subjects, (c) “legal immunity” to the researcher, and (d) facilitation of complete disclosure to subjects with no deception.
Article
Due to the inherent sensitivity of many survey questions, a number of researchers have adopted an indirect questioning technique known as the list experiment (or the item-count technique) in order to reduce dishonest or evasive responses. However, standard practice with the list experiment requires a large sample size, utilizes only a difference-in-means estimator, and does not provide a measure of the sensitive item for each respondent. This paper addresses all of these issues. First, the paper presents design principles for the standard list experiment (and the double list experiment) for the reduction of bias and variance as well as providing sample-size formulas for the planning of studies. Second, this paper proves that a respondent-level probabilistic measure for the sensitive item can be derived. This provides a basis for diagnostics, improved estimation, and regression analysis. The techniques in this paper are illustrated with a list experiment from the 2008–2009 American National Election Studies (ANES) Panel Study and an adaptation of this experiment.
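The standard difference-in-means estimator mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the improved estimators the paper proposes: the control group reports how many of J innocuous items apply, the treatment group reports a count over the same list plus the sensitive item, and the difference in mean counts estimates the prevalence of the sensitive item. The function name `list_experiment_estimate` and the toy data are hypothetical.

```python
import math
import statistics


def list_experiment_estimate(control_counts, treatment_counts):
    """Difference-in-means estimate of sensitive-item prevalence.

    Returns (estimate, standard_error). The standard error uses the usual
    unequal-variance formula for a difference of two independent means.
    """
    if len(control_counts) < 2 or len(treatment_counts) < 2:
        raise ValueError("each group needs at least two respondents")
    estimate = statistics.fmean(treatment_counts) - statistics.fmean(control_counts)
    se = math.sqrt(
        statistics.variance(treatment_counts) / len(treatment_counts)
        + statistics.variance(control_counts) / len(control_counts)
    )
    return estimate, se


# Toy data (invented for illustration only).
control = [1, 2, 2, 3]      # counts over J innocuous items
treatment = [2, 2, 3, 3]    # counts over J items plus the sensitive item
estimate, se = list_experiment_estimate(control, treatment)
```

The large standard error relative to direct questioning is the efficiency cost that motivates the design principles and sample-size formulas discussed in the abstract.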
Article
Political scientists have long been interested in citizens' support level for such actors as ethnic minorities, militant groups, and authoritarian regimes. Attempts to use direct questioning in surveys, however, have largely yielded unreliable measures of these attitudes as they are contaminated by social desirability bias and high nonresponse rates. In this paper, we develop a statistical methodology to analyze endorsement experiments, which recently have been proposed as a possible solution to this measurement problem. The commonly used statistical methods are problematic because they cannot properly combine responses across multiple policy questions, the design feature of a typical endorsement experiment. We overcome this limitation by using item response theory to estimate support levels on the same scale as the ideal points of respondents. We also show how to extend our model to incorporate a hierarchical structure of data in order to uncover spatial variation of support while recouping the loss of statistical efficiency due to indirect questioning. We illustrate the proposed methodology by applying it to measure political support for Islamist militant groups in Pakistan. Simulation studies suggest that the proposed Bayesian model yields estimates with reasonable levels of bias and statistical power. Finally, we offer several practical suggestions for improving the design and analysis of endorsement experiments.
Article
Standard estimation procedures assume that empirical observations are accurate reflections of the true values of the dependent variable, but this assumption is dubious when modeling self-reported data on sensitive topics. List experiments (a.k.a. item count techniques) can nullify incentives for respondents to misrepresent themselves to interviewers, but current data analysis techniques are limited to difference-in-means tests. I present a revised procedure and statistical estimator called LISTIT that enable multivariate modeling of list experiment data. Monte Carlo simulations and a field test in Lebanon explore the behavior of this estimator.
Article
Combating militant violence - particularly within South Asia and the Middle East - stands at the top of the international security agenda. Much of the policy literature focuses on poverty as a root cause of support for violent political groups and on economic development as a key to addressing the challenges of militancy and terrorism. Unfortunately, there is little evidence to support this contention, particularly in the case of Islamist militant organizations. To address this gap we conducted a 6000-person, nationally representative survey of Pakistanis that measures affect towards four important militant organizations. We apply a novel measurement strategy to mitigate item nonresponse, which plagued previous surveys due to the sensitive nature of militancy. Our study reveals three key patterns. First, Pakistanis exhibit negative affect toward all four militant organizations, with those from areas where groups have conducted the most attacks disliking them the most. Second, contrary to conventional expectations poor Pakistanis dislike militant groups more than middle-class citizens. Third, this dislike is strongest among poor urban residents, suggesting that the negative relationship stems from exposure to the externalities of terrorist attacks. Longstanding arguments tying support for violent political organizations to individuals’ economic prospects - and the subsequent policy recommendations - may require substantial revision.
Article
The item count technique is an indirect questioning technique that is used to estimate the proportion of people who have engaged in stigmatizing behavior. This technique is expected to yield a more appropriate estimate than the ordinary direct questioning technique because it requests respondents to indicate, based on a list of several items, simply the number of items that are applicable to them, including the target key item. An experimental web survey was conducted in an attempt to compare the direct questioning technique and the item count technique. Compared with the direct questioning technique, the item count technique yielded higher estimates of the proportion of shoplifters by nearly 10 percentage points, whereas the difference between the estimates using these two techniques was mostly insignificant with respect to innocuous blood donation. The survey results suggest that in the item count technique respondents tend to report fewer total behaviors compared to the direct question case. This tendency is more pronounced in the case of longer item lists. Three domain estimators for the item count technique were compared, and the cross-based method appeared to be the most appropriate method. Large differences in domain estimates for shoplifting between the item count and direct questioning techniques were found among female respondents, middle-aged respondents, respondents living in urban areas, and highly-educated respondents.
Article
The item count technique is a survey methodology that is designed to elicit respondents’ truthful answers to sensitive questions such as racial prejudice and drug use. The method is also known as the list experiment or the unmatched count technique and is an alternative to the commonly used randomized response method. In this article, I propose new nonlinear least squares and maximum likelihood estimators for efficient multivariate regression analysis with the item count technique. The two-step estimation procedure and the Expectation Maximization algorithm are developed to facilitate the computation. Enabling multivariate regression analysis is essential because researchers are typically interested in knowing how the probability of answering the sensitive question affirmatively varies as a function of respondents’ characteristics. As an empirical illustration, the proposed methodology is applied to the 1991 National Race and Politics survey where the investigators used the item count technique to measure the degree of racial hatred in the United States. Small-scale simulation studies suggest that the maximum likelihood estimator can be substantially more efficient than alternative estimators. Statistical efficiency is an important concern for the item count technique because indirect questioning means loss of information. The open-source software is made available to implement the proposed methodology.
Article
This study reports on the measurement problem in studying tax evasion behaviour of individuals. The three most frequently used methods in researching tax evasion (self-reports, officers' classification and experimental methods) are presented. Having observed a lack of association between self-report evasion behaviour and officers' classifications in a previous study (Elffers, Weigel and Hessing 1987), the authors report an empirical study in which the three measures were used on one and the same sample of taxpayers. Not only was the lack of association between self-reported behaviour and officers' classifications replicated but evasion in the experiment did not correlate with either of these. The authors conclude that tax evasion consists of at least three conceptually independent aspects that need to be assessed by three independent measures. Consequences for future research on tax evasion are discussed.
Article
While negative correlations have often been found between a respondent's education and his attitudes towards foreigners, the reasons for this education effect are still under debate. We examined the hypothesis that the highly educated may not be genuinely less xenophobic, but simply more prone to give socially desirable, xenophile answers in attitude questionnaires. We therefore compared the attitudes of respondents who were either questioned directly or using a cheating detection extension of the randomized-response technique (RRT). The latter is supposed to yield more honest answers to sensitive questions by experimentally offering the interviewee a higher degree of confidentiality. Under direct questioning conditions, we replicated the education effect; 75% of the highly educated expressed xenophile attitudes, as opposed to only 55% of the less educated. Under randomized-response conditions, we obtained significantly reduced estimates of 53% for the proportion of xenophiles among the highly educated, and 24% among the less educated, indicating a strong distortion of self-reported attitudes towards foreigners in both groups. However, a significant proportion of participants disobeyed the RRT instructions regardless of education. Because the education effect was found even after controlling for social desirability, it seems to be a genuine effect, rather than an artefact of a differential response bias.
Article
For various reasons individuals in a sample survey may prefer not to confide to the interviewer the correct answers to certain questions. In such cases the individuals may elect not to reply at all or to reply with incorrect answers. The resulting evasive answer bias is ordinarily difficult to assess. In this paper it is argued that such bias is potentially removable through allowing the interviewee to maintain privacy through the device of randomizing his response. A randomized response method for estimating a population proportion is presented as an example. Unbiased maximum likelihood estimates are obtained and their mean square errors are compared with the mean square errors of conventional estimates under various assumptions about the underlying population.
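A randomized response estimator of the kind this abstract describes can be sketched under the following assumptions: each respondent privately draws, with known probability p, the statement "I belong to the sensitive group" and otherwise its negation, then answers truthfully whether the drawn statement is true. The probability of a "yes" is then p·π + (1 − p)·(1 − π), which can be solved for the prevalence π. The function name `warner_estimate` and the numbers below are hypothetical.

```python
def warner_estimate(n_yes, n, p):
    """Estimate prevalence under a two-statement randomized response design.

    n_yes: number of "yes" answers; n: sample size;
    p: probability that the respondent receives the sensitive statement.
    Returns (prevalence_estimate, plug_in_variance).
    """
    if n <= 0 or not 0 <= n_yes <= n:
        raise ValueError("need n > 0 and 0 <= n_yes <= n")
    if not 0 < p < 1 or p == 0.5:
        raise ValueError("p must lie in (0, 1) and differ from 0.5")
    lam = n_yes / n  # observed share of "yes" answers
    # Invert P(yes) = p * pi + (1 - p) * (1 - pi) for pi.
    pi_hat = (lam - (1 - p)) / (2 * p - 1)
    # Plug-in sampling variance; it grows without bound as p approaches 0.5.
    variance = lam * (1 - lam) / (n * (2 * p - 1) ** 2)
    return pi_hat, variance


pi_hat, variance = warner_estimate(n_yes=400, n=1000, p=0.7)
```

The design choice is a trade-off: values of p close to 0.5 protect respondents more but inflate the variance, which is why the abstract compares mean square errors against conventional estimates.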
Article
The present study was designed to compare response rates on a standard self-report questionnaire that was nominally anonymous to an unmatched count questionnaire that allowed for true anonymity in responding. Four hundred and fifty-four college students were asked about several topics, including attitudes towards weight and shape, dieting, and eating disordered behavior using one of two response formats; either a standard questionnaire in true-false format or an unmatched count questionnaire that did not require participants to directly answer sensitive questions. Both males and females had significantly different rates of endorsement between the two methods of assessment on the majority of the eating-related questions. Response format and degree of anonymity affect endorsement of eating-related thoughts and behaviors. Understanding response bias is critical to determining accurate rates of eating disordered thoughts and behaviors.
Article
Gaining valid answers to so-called sensitive questions is an age-old problem in survey research. Various techniques have been developed to guarantee anonymity and minimize the respondent's feelings of jeopardy. Two such techniques are the randomized response technique (RRT) and the unmatched count technique (UCT). In this study we evaluate the effectiveness of different implementations of the RRT (using a forced-response design) in a computer-assisted setting and also compare the use of the RRT to that of the UCT. The techniques are evaluated according to various quality criteria, such as the prevalence estimates they provide, the ease of their use, and respondent trust in the techniques. Our results indicate that the RRTs are problematic with respect to several domains, such as the limited trust they inspire and non-response, and that the RRT estimates are unreliable due to a strong false "no" bias, especially for the more sensitive questions. The UCT, however, performed well compared to the RRTs on all the evaluated measures. The UCT estimates also had more face validity than the RRT estimates. We conclude that the UCT is a promising alternative to RRT in self-administered surveys and that future research should be directed towards evaluating and improving the technique.
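For readers unfamiliar with the forced-response design mentioned in this abstract, a minimal sketch of its prevalence estimator follows. It assumes a randomizing device that forces a "yes" with probability `p_forced_yes`, forces a "no" with probability `p_forced_no`, and otherwise asks for a truthful answer. These parameter names, the function name `forced_response_estimate`, and the die-based probabilities in the example are assumptions, not details taken from the study.

```python
def forced_response_estimate(n_yes, n, p_forced_yes, p_forced_no):
    """Estimate prevalence under a forced-response randomized response design.

    Returns (prevalence_estimate, plug_in_variance).
    """
    p_truth = 1 - p_forced_yes - p_forced_no
    if n <= 0 or not 0 <= n_yes <= n:
        raise ValueError("need n > 0 and 0 <= n_yes <= n")
    if p_forced_yes < 0 or p_forced_no < 0 or p_truth <= 0:
        raise ValueError("forced probabilities must leave room for truthful answers")
    lam = n_yes / n  # observed share of "yes" answers
    # Invert P(yes) = p_forced_yes + p_truth * pi for pi.
    pi_hat = (lam - p_forced_yes) / p_truth
    variance = lam * (1 - lam) / (n * p_truth ** 2)
    return pi_hat, variance


# Example: a die roll where 1 forces "yes" and 6 forces "no" (assumed setup).
pi_fr, var_fr = forced_response_estimate(300, 1000, 1 / 6, 1 / 6)
```

Note that the estimator can fall below zero when respondents who are forced to say "yes" answer "no" instead, which is one way a false "no" bias of the sort reported in the abstract would surface in the estimates.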