The Past, Present, and Future of Research on Interviewer Effects
Kristen Olson,1 Jolene D. Smyth,1 Jennifer Dykema,2
Allyson L. Holbrook,3 Frauke Kreuter,4 & Brady T. West 5
1 University of Nebraska-Lincoln
2 University of Wisconsin–Madison
3 University of Illinois at Chicago
4 University of Maryland, University of Mannheim, & Institute for
Employment Research
5 University of Michigan
Contents

1 Introduction
2 Training, Managing, and Monitoring Interviewers
3 Interviewer Effects Across Contexts and Modes
4 Interviewers and Nonresponse
5 Interviewer Pace and Behaviors
6 Estimating Interviewer Effects
7 Closing Thoughts
Acknowledgments
References
Published in Interviewer Effects from a Total Survey Error Perspective, ed. Kristen Olson, Jolene D. Smyth, Jennifer Dykema, Allyson L. Holbrook, Frauke Kreuter, and Brady T. West (2020), Boca Raton: CRC Press, pp. 3-15. Copyright © 2020 Taylor & Francis Group, LLC. Used by permission.

1 Introduction

Interviewer-administered surveys are a primary method of collecting information from populations across the United States and the world, including large-scale government surveys that monitor populations
(e.g., the Current Population Survey), surveys used by the academic
community to understand what people think and do (e.g., the Gen-
eral Social Survey), and surveys designed to gauge public opinion
at a particular time point (e.g., the Gallup Daily Tracking Poll). Interviewers contribute to these surveys in a multitude of ways, including creating lists of housing units for sampling, persuading sampled units to participate, and administering survey questions (Morton-Williams 1993). In an increasing number of surveys, interviewers are also tasked with collecting blood, saliva, and other biomeasures, and asking survey respondents for consent to link survey data to administrative records (Sakshaug 2013). Interviewers are also used to follow up with sampled units when self-administered modes have failed (e.g., the American Community Survey and the Agricultural Resource Management Survey; de Leeuw 2005; Dillman, Smyth and Christian 2014; Olson et al. 2019). In completing these varied tasks, interviewers can affect the quality of the resulting survey data (e.g., Schaeffer, Dykema and Maynard 2010; West and Blom 2017).
Errors introduced by interviewers can take the form of bias or variance. Early research found that interviewers vary in the answers they obtain from their assigned sample clusters in both face-to-face (Hansen, Hurwitz and Bershad 1961) and telephone surveys (Mathiowetz and Cannell 1980). In particular, similar to a design effect for cluster samples, interviewers increase the variance of an estimated mean as a function of their average workload (b) and the intra-interviewer correlation (IIC; the degree of within-interviewer correlation in measurements): 1 + (b - 1) × IIC. IIC values range from very small (0.001) to large (0.20 and higher) (Elliott and West 2015), with median values typically below 0.02. Given an IIC of 0.02 and a workload of 50 interviews per interviewer, the variance of an estimated mean is almost doubled and standard errors are increased by 40%. Thus, a fundamental goal of research on interviewers is understanding what contributes to (and how to minimize) the IIC.
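As a quick check of the arithmetic above (an illustration only, using the values quoted in the text), the design-effect formula can be computed directly:

```python
# Illustration only: the interviewer design effect quoted in the text.
b = 50       # average workload: interviews per interviewer
iic = 0.02   # intra-interviewer correlation (IIC)

deff_int = 1 + (b - 1) * iic     # variance inflation factor for an estimated mean
se_inflation = deff_int ** 0.5   # corresponding inflation of the standard error

print(f"Variance inflation (deff): {deff_int:.2f}")    # 1.98 -- nearly doubled
print(f"Standard error inflation:  {se_inflation:.2f}")  # ~1.41, i.e., ~40% larger
```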
Even if the IIC is small, interviewer characteristics and behaviors can still bias responses. Interviewer sociodemographic characteristics and voices may be associated with responses to survey questions or indicators of survey error (e.g., an indicator of a sampled unit responding or not). Because these are fixed characteristics of interviewers, they typically bias estimates. In addition, research based on coding information about the interviewer-respondent interaction shows that behaviors such as misreading questions, probing directively, and acting non-neutrally may affect the answers that respondents provide.
The goal of the Interviewers and Their Effects from a Total Survey Error Perspective workshop in February 2019 was to convene an international group of leading academic, government, and industry researchers and practitioners to discuss methods, practical insights, and a research agenda related to interviewers. The workshop aimed to (1) examine the effects of interviewers on multiple error sources, (2) evaluate study design and estimation strategies for interviewer effects, and (3) set an agenda for future studies of interviewers. After two days of presentations and posters, workshop participants spent the third day discussing and identifying areas where more work is needed. This chapter introduces an edited volume consisting of chapters written by workshop participants. First, we briefly summarize what is known about interviewer effects across the survey process. Second, we situate the chapters from this volume within this literature. Finally, we describe directions for future research that arose from the third day of focused discussion.
2 Training, Managing, and Monitoring Interviewers
How interviewers are trained, managed, and monitored can have important effects on the resultant survey data. Although standardized interviews are the gold standard, a strict implementation of standardization may not occur in practice, and training, monitoring, and feedback systems vary widely across survey organizations (Viterna and Maynard 2002). Standardized interviewing commonly includes reading questions exactly as worded, recording answers exactly as provided, following up inadequate answers nondirectively, and acknowledging adequate answers (Fowler and Mangione 1990). Yet some of the central tenets of standardized interviewing - including reading questions verbatim - are inadequately operationalized in actual training materials. Additionally, survey practitioners often make decisions about interviewer training with little to no empirical evidence regarding their effectiveness, and there have been few updates to the standardized training philosophy or materials since Fowler and Mangione (1990). The authors of Chapter 2 tackle this important issue, updating basic training of General Interviewing Techniques for question administration based on decades of research on interviewer-respondent interaction and survey practice. Among other things, their update tackles thorny issues of how to recognize an adequate answer (acknowledging the critical role of characteristics of survey questions in determining what makes an answer codable), how to maintain respondent engagement, and what common conversational practices can be allowed in the interview.
Although this research has clear implications for survey operations, it was clear during the workshop that this research has not been fully adopted by survey organizations. We have few recent descriptions of how survey organizations select, train, and monitor survey interviewers, especially for smaller survey organizations. Yet many operational concerns are researchable, and such research could yield both theoretical and practical insights. Miller and colleagues consider such operational concerns and what kinds of practical training methods may address these concerns. The meta-analysis of interviewer training reported in Chapter 4 includes many such studies. Workshop discussion also identified open questions about selecting, training, and monitoring interviewers, including how to identify successful interviewers from a pool of applicants, how the pool of interviewers has changed over time, and the influence of the supervisor on the interviewer. Future research is also needed on topics ranging from interviewer recruitment, to shaping interviewer perceptions about their tasks, to what combination of in-person and online methods is best for delivering training content, to what levels of interviewer skill and engagement organizations can realistically maintain.
Many have questioned the utility of standardization, particularly for complex questions and questionnaires (e.g., Schober and Conrad 1997; Suchman and Jordan 1990). Interviewers often break from standardization on these types of items and questionnaires. Based on qualitative in-depth interviews with survey interviewers and their reactions to vignettes, Kaplan and Yu (Chapter 5) examine how interviewers understand standardization and why departures from standardization occur. Innovations in technology, increased computing power, and the growing availability of paradata have dramatically changed how we monitor and train interviewers, particularly for in-person interviewing (e.g., Edwards, Maitland and Connor 2017; Olson and Wagner 2015). Edwards, Sun, and coauthors (Chapter 6) examine interviewer behaviors as detected via monitoring using computer-assisted recorded interviewing (CARI). These two chapters highlight the need to train interviewers on expected behaviors and the value of reinforcing this training through real-time monitoring. Throughout the workshop, participants echoed the need for more information on how survey organizations monitor and provide feedback to interviewers in both in-person and telephone interviews.
Although in-depth interviews and CARI recordings provide unique insights into the interview process, many organizations lack the resources to conduct such studies and instead use paradata and interviewer observations to evaluate interviewers. The authors of Chapter 7 use paradata on interview duration and related indicators to examine where these indicators help to identify problems and where they fail. West et al. (Chapter 8) use data from two large surveys to assess whether post-survey interviewer observations about the survey process are associated with indicators of measurement error. The sets of observations that are most important to collect for monitoring and evaluation purposes differ across studies, in part because of differences in the types of survey questions asked and observations collected across studies. More work is needed to align these observations across surveys.
3 Interviewer Effects Across Contexts and Modes
Although interviewers are trained to conduct interviews in private, many interviews are conducted in the presence of other people, which can affect answers, particularly when reporting about sensitive topics (Mneimneh et al. 2015). In community-based studies where in-community interviewers are recruited, respondents may also be interviewed by someone they know. The authors of Chapter 9 examine how the presence of others is related to reports of sensitive behaviors and attitudes in the in-person Saudi National Mental Health Survey. The authors of Chapter 10 examine the consequences of respondents being interviewed by a member of their community who is known to them in a community-based participatory research study in an American Indian community. Combined, these studies reveal the need for more research into interview privacy and the interviewer-respondent relationship across social and cultural contexts.
Additionally, the mode and/or device for the interview - in-person, landline phone, cellular phone, or audio computer-assisted self-interviewing, and interviewer input into a desktop or laptop, tablet, or smartphone - may affect the interaction and the resulting data (e.g., Timbrook, Smyth and Olson 2018). Notably, the mode or device for the interaction changes the nature of the interaction between interviewers and respondents. One set of chapters examines differences in the interviewer-respondent interaction across telephone and in-person interviews and differences in interviewer involvement in voice versus text message-based interviews. Conrad et al. (Chapter 11) replace human interviewers with virtual ones, in which the interviewer is virtual but their voice is that of an audio-recorded human. These chapters raise important questions about interviewers and technology: Is the presence of an interviewer (virtual or otherwise) an important feature of the survey, and for which tasks is the interviewer critical to consider? Can virtual interviewers provide the benefits of human interviewers while reducing interviewer effects?
4 Interviewers and Nonresponse
Interviewers vary in their success at contacting and recruiting sampled units (e.g., Campanelli, Sturgis and Purdon 1997; Groves and Couper 1998) - due to both heuristic cues such as their voices and their behaviors during the recruitment interaction (e.g., Couper and Groves 2002) - and this variation can produce nonresponse error variance (e.g., West, Kreuter and Jaenichen 2013). Interviewers are commonly trained to tailor their approach to sampled units, yet there are few examples of how tailoring is operationalized (e.g., Groves and Couper 1998). Research about tailoring generally relies on interviewer reports with measures that vary across studies. To address this, Ackermann-Piek, Korbmacher, and Krieger (Chapter 14) predict survey contact and cooperation using interviewer characteristics across multiple surveys. Finding little replication in associations between the covariates and the nonresponse outcomes across studies, they emphasize the importance of real-time monitoring of interviewers.
In Chapter 15, Wescott discusses a case management model for a telephone survey organization in which interviewers have greater autonomy in deciding when to call cases. While interviewers report being more satisfied under this model, the model yields lower productivity than using a call scheduler. More work is needed to understand how interviewer autonomy and insights can be leveraged by a telephone survey organization to increase interviewer engagement and ultimately retain interviewers.
Survey interviews increasingly ask respondents to provide blood, saliva, urine, or other biomeasures or for permission to link their survey data to administrative data (e.g., Jaszczak, Lundeen and Smith 2009; Sakshaug et al. 2012). In Chapter 16, Cernat, Sakshaug, and their coauthor use nurse characteristics and paradata to examine each stage of obtaining biomeasures for a general population survey. These stages - participating in the nurse visit, consenting to a blood sample, and obtaining a blood sample - reveal substantial variation in the nonresponse outcomes related to the nurses, and that the predictors of nonresponse vary across the stages. This work and additional workshop discussion suggest that we need more research on the antecedents and consequences of interviewer variation in the ability to successfully collect biomeasures and obtain consent for data linkage.
5 Interviewer Pace and Behaviors
Several chapters in this volume examine interviewer pace and behaviors during the interview itself. Holbrook et al. (Chapter 17) test which interviewer and question characteristics are associated with response latencies and indicators of respondent comprehension and response difficulty. Garbarski et al. (Chapter 18) examine how the time taken to administer and answer questions is associated with interviewer, respondent, and question characteristics. Kelley (Chapter 19) examines whether indicators generated from timing paradata can be used to identify question misreadings, testing three methods of setting thresholds. Olson and Smyth (Chapter 20) examine changes in interviewer behaviors over time.
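Kelley's chapter evaluates specific threshold-setting methods; the sketch below is only a generic illustration of the underlying idea, not the chapter's procedure. It flags question administrations whose timing paradata show less time on a question than a verbatim reading could plausibly take, using an assumed maximum speaking rate (the field names and the 4-words-per-second ceiling are hypothetical):

```python
# A generic threshold rule (not Kelley's methods): flag administrations whose
# recorded duration is shorter than a plausible verbatim reading of the question.
from dataclasses import dataclass

MAX_WORDS_PER_SECOND = 4.0  # assumed ceiling for an intelligible reading pace

@dataclass
class Administration:
    interviewer_id: str
    question_id: str
    n_words: int      # number of words in the scripted question text
    seconds: float    # timing paradata: seconds spent on the question

def flag_possible_misreading(adm: Administration) -> bool:
    """True if the question was administered faster than a full reading allows,
    suggesting the question may have been shortened or skipped."""
    minimum_reading_time = adm.n_words / MAX_WORDS_PER_SECOND
    return adm.seconds < minimum_reading_time

# Example: a 24-word question administered in 3 seconds is flagged (24 / 4 = 6s minimum).
print(flag_possible_misreading(Administration("int-07", "Q12", n_words=24, seconds=3.0)))  # True
```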
Although all of these chapters examine some aspect of interview timing - pace - their conceptualizations and operationalizations vary (e.g., interview duration, question duration, interviewer speaking time or speed, response latencies, words per second, and questions per minute; see also Chapter 21, in which Dahlhamer et al. use average number of seconds per question across a questionnaire). Other chapters use behavior codes to examine interviewer and respondent behaviors (e.g., Fowler 2011; Ongena and Dijkstra 2006). Workshop discussion highlighted the heterogeneity in how behavior coding is implemented (from live interviews, recordings, or transcripts, and focused on respondents, interviewers, or both) and the variety of operational and analytical decisions often made in such research. Such decisions include assigning codes at the question level or the conversational turn level (Olson and Parkhurst 2013); coding the entire questionnaire or a subset of items (e.g., see Fowler 2011; Mathiowetz and Cannell 1980) or in combination; whether codes are assigned to behaviors individually or sequentially; and dealing with overlapping speech, interruptions, and other normal conversational events. Even simple issues of how many interviews to code and the (acceptable) reliability of the codes vary across studies.
Despite this heterogeneity, there are common patterns observed across studies. First, interviewers administer questions more quickly as they gain experience (Holbrook et al., Chapter 17; Garbarski et al., Chapter 18; Olson and Smyth, Chapter 20; Olson and Bilgen 2011; Olson and Peytchev 2007). Second, question characteristics drive this phenomenon. Figure 1 shows the percent of variance in question duration attributable to interviewers, respondents, and questions across multiple studies with question-level duration as the outcome. Some models are estimated as three-level multilevel models, so the residual variance accounts for question-level variation. Other models are estimated as cross-classified models in which question-level variance is explicitly estimated as part of the model. Across these four studies, interviewer- and respondent-level variation is small and the question-level (or residual that incorporates questions) variation is large.

Figure 1 Variance in question duration due to interviewers, respondents, and questions.
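As a minimal sketch of the kind of decomposition summarized in Figure 1 (not any chapter's actual analysis), a crossed random-effects model can be fit to question-level durations and its variance components compared. The example below simulates data and uses statsmodels; the variable names and simulated variances are assumptions for illustration:

```python
# A sketch (simulated data, assumed variances): decompose variance in log question
# duration into interviewer, respondent, and question components with a crossed
# random-effects model, fit via statsmodels variance components.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n_int, n_resp_per_int, n_q = 10, 5, 20

int_eff = rng.normal(0, 0.10, n_int)   # small interviewer-level variation
q_eff = rng.normal(0, 0.50, n_q)       # large question-level variation
rows = []
for i in range(n_int):
    for r in range(n_resp_per_int):
        resp = i * n_resp_per_int + r
        r_eff = rng.normal(0, 0.15)    # respondent-level variation
        for q in range(n_q):
            y = 2.0 + int_eff[i] + r_eff + q_eff[q] + rng.normal(0, 0.40)
            rows.append((i, resp, q, y))
df = pd.DataFrame(rows, columns=["interviewer", "respondent", "question", "log_duration"])
df["one"] = 1  # a single 'group', so all three factors enter as crossed components

vc = {"interviewer": "0 + C(interviewer)",
      "question": "0 + C(question)",
      "respondent": "0 + C(respondent)"}
model = smf.mixedlm("log_duration ~ 1", data=df, groups="one",
                    re_formula="0", vc_formula=vc)
result = model.fit()
print(result.summary())  # one variance component per factor, plus residual ("scale")

# Share of total variance at each level; the interviewer share plays the role of
# the intra-interviewer correlation (IIC) for this outcome.
total = result.vcomp.sum() + result.scale
print("residual share of variance:", round(result.scale / total, 3))
```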
There is a consistent tendency for longer questions and questions written at higher grade levels to have longer durations (e.g., Garbarski et al., Chapter 18; Couper and Kreuter 2013; Olson and Smyth 2015). Yet other question characteristics, including question placement, question sensitivity, type of question (attitude, factual, knowledge), and features such as definitions, instructions, parentheticals, battery items, and measures from question evaluation systems such as QUAID and SQP (Saris and Gallhofer 2007), are inconsistently parameterized or have inconsistent associations across studies. Understanding the mechanisms for these connections still requires more work. One clear direction from the workshop for future research was to identify a common set of dependent variables related to pace and behaviors, a common set of question characteristics, and a common set of respondent and interviewer characteristics, to parameterize these identically, and to replicate analyses across surveys and organizations. Experimental work in this area of research is also of interest.
6 Estimating Interviewer Effects
Estimating interviewer effects requires careful attention to study design and statistical modeling. Researchers increasingly use multilevel (random effects) models, including cross-classified models, to study how properties of survey questions themselves may affect the magnitude of interviewer effects (e.g., O'Muircheartaigh and Campanelli 1998, 1999).

In Chapter 21, Dahlhamer et al. estimate interviewer effects for over 100 outcomes in the National Health Interview Survey. They find larger interviewer effects among interviewers who administer questions at a faster pace than among those who administer questions at a slower pace. Similarly, Loosveldt and Wuyts (Chapter 22) examine interviewer effects across countries, comparing two analytic approaches to estimating these effects. Similar to Dahlhamer et al., Loosveldt and Wuyts meta-analyze their estimates, summarizing interviewer variance across independent groups of interviewers. In Chapter 23, West reviews design decisions needed to compare interviewer variance components across two groups of interviewers, using a unique study in Germany comparing standardized and conversational approaches to interviewing. From assigning interviewers to conditions, to power analyses, to analytic decisions, this chapter provides practical insights into these complicated designs.
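As a generic illustration of the estimation problem discussed in this section (not the approach of Chapters 21-23), the intra-interviewer correlation for a single outcome can be estimated with the classic one-way ANOVA estimator and compared across two independent groups of interviewers; the simulated data, group labels, and workload values below are assumptions:

```python
# A generic sketch: one-way ANOVA estimator of the intra-interviewer correlation,
# computed separately for two (simulated) independent groups of interviewers.
import numpy as np

def anova_iic(y_by_interviewer):
    """IIC estimate: (MSB - MSW) / (MSB + (b - 1) * MSW), assuming equal workloads b."""
    a = len(y_by_interviewer)                  # number of interviewers
    b = len(y_by_interviewer[0])               # interviews per interviewer (workload)
    grand_mean = np.concatenate(y_by_interviewer).mean()
    means = np.array([y.mean() for y in y_by_interviewer])
    msb = b * ((means - grand_mean) ** 2).sum() / (a - 1)
    msw = sum(((y - y.mean()) ** 2).sum() for y in y_by_interviewer) / (a * (b - 1))
    return (msb - msw) / (msb + (b - 1) * msw)

rng = np.random.default_rng(2)

def simulate_group(n_interviewers, workload, interviewer_sd):
    """Outcome = interviewer effect + respondent-level noise (sd = 1)."""
    return [rng.normal(rng.normal(0.0, interviewer_sd), 1.0, size=workload)
            for _ in range(n_interviewers)]

group_a = simulate_group(n_interviewers=30, workload=40, interviewer_sd=0.10)  # low IIC
group_b = simulate_group(n_interviewers=30, workload=40, interviewer_sd=0.25)  # higher IIC

print("estimated IIC, group A:", round(anova_iic(group_a), 3))
print("estimated IIC, group B:", round(anova_iic(group_b), 3))
```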
7 Closing Thoughts
Several themes emerge from the chapters in this volume. Even though interviewers have been central to data gathered to understand society since the beginning of survey research, we know surprisingly little about them. Namely, we know little about how interviewers are recruited, trained, and managed across organizations, how interviewers perceive their job, and how we can best support their work. These are important issues - interviewers can do a lot of harm to survey data if they try to do so (Chapter 7), but they also inadvertently introduce error into data even when they are attempting to follow their training. Understanding the challenges and constraints interviewers face will facilitate understanding the mechanisms underlying interviewer-related survey errors.
Most of the chapters in this volume use observational data. Observational research is constrained by the data that a research team has available. Through the workshop and studies featured in this volume, we see a clear need to standardize measures and operationalizations and create more consistency across operational implementations. This includes consistency in concepts and nomenclature (e.g., Conrad and Schober 2000), as well as in the covariates and outcomes examined across studies. Variation in interviewer-related variance across countries (Loosveldt and Wuyts, Chapter 22) reveals the need for understanding interviewing practice, training, and management in different countries and cultures. This volume contains many studies conducted outside of the United States, and we encourage continued international and comparative work in future research.
As researchers and practitioners, we call on survey organizations to make information on interviewers available in public-use analytic data files. Table 1 contains a list of concepts that workshop participants recommended be released with public-use data files and methodology reports. At the bare minimum, an anonymized interviewer ID variable should be included in public-use data files. Other useful variables describe the interviewers and their work. At the interviewer level, these include measures of work productivity and quality, measures of how thinly spread the interviewer is across multiple projects, and reports from the interviewers themselves about their job, as well as measures of interviewer expectations and attitudes. At the organization and study level, information about the amount, type, and content of interviewer training, as well as details about the supervision and monitoring practices and feedback provided to interviewers from supervisors or monitors, would also be valuable, as would information about the content of general interviewing techniques training and any study-specific training provided to interviewers. Some of this information could be included in methodology reports. Although many organizations may consider information about training, supervision, and monitoring to be proprietary, more complete disclosure is certainly needed to understand these rarely studied, yet critically important, survey practices.
Table 1 A List of Recommended Information About Interviewers to Include in Public-Use Data Files and Methodology Reports

Interviewer ID
Interviewer characteristics
  Demographics (gender, race, age, educational status)
  Personality assessments
  Experience (within survey, within organization, across organizations)
  Certification test scores
  Number of other jobs they are currently working
  Performance metrics and problems on other studies
  Ever been fired for performance issues on other studies
  Number of hours worked on other studies
Interviewer expectation and attitudinal measures
  Ratings of the importance of a completed interview
  Ratings of the importance of obtaining high-quality data
  Description of how sensitive and difficult questions are approached
  Ratings of other variables related to job satisfaction and engagement
Interviewer training variables
  Content of training and training methods (e.g., round-robin, type and content of at-home study, hours/days of training by topic)
  Participation in any specialized training (e.g., for refusal avoidance)
  Number of trainings attended
Interview process variables
  Sanitized post-survey observations
  Number of other projects that interviewers working on the current project worked on during the current project
  Whether and how interviewers were matched to respondents
  Measures of field performance, including ICC information for variables
  Adaptive design features and implementation
  Interviewers' notes about survey questions
Organization and study-specific characteristics
  Information on recruiting, hiring, training, and attrition
  Monitoring, feedback, and falsification detection activities
  Description of the supervisory structure of the interviewer corps (e.g., number of interviewers per supervisor)
Finally, dissemination and integration of research on interviewers into survey practice is hard. Many new practices may face cultural opposition at organizations simply because they are not the way that work has been done in the past. Clients, survey project managers, and supervisors are often risk averse about trying something new, even if integrating recommendations based on research could improve survey practice. Practitioners often do not have time to read the latest research, and survey methodologists are often disconnected from day-to-day survey operations.

We suggest a few ways forward. First, many professional associations bring together survey practitioners and methodological researchers. Carving out space for these two disparate groups to discuss mutually interesting problems can facilitate research translating into practice and practice informing research questions. For several years the Midwest Association for Public Opinion Research has sponsored a workgroup in which researchers and practitioners talk about important topics on interviewers; in 2019, for instance, the discussion focused on interviewer training. Second, methodologists who work at organizations with survey shops or who contract for research with survey shops are well positioned to translate research into practice. Third, funding research on questions directly related to practice could inspire some changes in survey practice that would improve data quality. Furthermore, recognizing contributions other than simply research articles - for instance, data being available in the public domain, availability of code, availability of interviewer training materials, and documentation of survey practices that worked or did not work - would ease translation of work from one research team to another. Although these are hard tasks, we think they are worthwhile future pursuits.
Acknowledgments  The workshop and this edited volume were supported by the National Science Foundation (SES-1758834), the Charles Cannell Fund in Survey Methodology of the Survey Research Center at the University of Michigan, and the Rensis Likert Fund for Research in Survey Methodology at the University of Michigan. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.
References

Campanelli, P., P. Sturgis, and S. Purdon. 1997. Can You Hear Me Knocking? An Investigation into the Impact of Interviewers on Survey Response Rates. London: Survey Methods Centre at SCPR.

interactions while using a mobile computer-assisted personal interview device. 18:335-351.

Conrad, F. G. and M. F. Schober. 2000. Clarifying question meaning in a household telephone survey. Public Opinion Quarterly 64:1-28.

Couper, M. P. and R. M. Groves. 2002. Introductory interactions in telephone surveys and nonresponse. In Standardization and Tacit Knowledge: Interaction and Practice in the Survey Interview, ed. D. W. Maynard, H. Houtkoop-Steenstra, N. C. Schaeffer, and J. van der Zouwen. New York: John Wiley & Sons.

Couper, M. P. and F. Kreuter. 2013. Using paradata to explore item level response times and data quality. Journal of the Royal Statistical Society, Series A 176:271-286.

de Leeuw, E. D. 2005. To mix or not to mix data collection modes in surveys. Journal of Official Statistics 21:233-255.

Dijkstra, W. 1983. How interviewer variance can bias the results of research on interviewer effects.

Dillman, D. A., J. D. Smyth, and L. M. Christian. 2014. Internet, Phone, Mail, and Mixed-Mode Surveys: The Tailored Design Method. 4th ed. Hoboken, NJ: John Wiley & Sons.

Edwards, B., A. Maitland, and S. Connor. 2017. Measurement error in survey operations management. In Total Survey Error in Practice, ed. P. P. Biemer, E. de Leeuw, S. Eckman, B. Edwards, F. Kreuter, L. E. Lyberg, N. C. Tucker, and B. T. West, 253-277. Hoboken, NJ: John Wiley & Sons.

Elliott, M. R. and B. T. West. 2015. "Clustering by interviewer": A source of variance that is unaccounted for in single-stage health surveys. American Journal of Epidemiology 182:118-126.

Fowler, F. J. 2011. Coding the behavior of interviewers and respondents to evaluate survey questions. In Question Evaluation Methods: Contributing to the Science of Data Quality, ed. J. Madans, K. Miller, A. Maitland, and G. Willis, 7-21. Hoboken, NJ: John Wiley & Sons.

Fowler, F. J. and T. W. Mangione. 1990. Standardized Survey Interviewing: Minimizing Interviewer-Related Error. Newbury Park, CA: Sage Publications.

Garbarski, D., N. C. Schaeffer, and J. Dykema. 2016. Interviewing practices, conversational practices, and rapport: Responsiveness and engagement in the standardized survey interview. Sociological Methodology 46:1-38.

Graesser, A. C., Z. Cai, M. M. Louwerse, and F. Daniel. 2006. Question Understanding Aid (QUAID): A web facility that helps survey methodologists improve the comprehensibility of questions. Public Opinion Quarterly 70:3-22.

Groves, R. M. and M. Couper. 1998. Nonresponse in Household Interview Surveys. New York: John Wiley & Sons, Inc.

Telephone interviewer voice characteristics and the survey participation decision. In Advances in Telephone Survey Methodology, ed. J. M. Lepkowski, C. Tucker, J. M. Brick, E. D. de Leeuw, L. Japec, P. J. Lavrakas, M. W. Link, and R. L. Sangster, 385-400. Hoboken, NJ: John Wiley & Sons.

Hansen, M. H., W. N. Hurwitz, and M. A. Bershad. 1961. Measurement errors in censuses and surveys. Bulletin of the International Statistical Institute 38:359-374.

Jaszczak, A., K. Lundeen, and S. Smith. 2009. Using nonmedically trained interviewers to collect biomeasures in a national in-home survey.

Mathiowetz, N. A. and C. F. Cannell. 1980. Coding interviewer behavior as a method of evaluating performance.

Maynard, D. W., H. Houtkoop-Steenstra, N. C. Schaeffer, and J. van der Zouwen, eds. 2002. Standardization and Tacit Knowledge: Interaction and Practice in the Survey Interview. New York: John Wiley & Sons, Inc.

Mneimneh, Z. M., R. Tourangeau, B.-E. Pennell, S. G. Heeringa, and M. R. Elliott. 2015. Cultural variations in the effect of interview privacy and the need for social conformity on reporting sensitive information. Journal of Official Statistics 31:673-697.

Morton-Williams, J. 1993. Cambridge: University Press.

O'Muircheartaigh, C. and P. Campanelli. 1999. A multilevel exploration of the role of interviewers in survey non-response. Journal of the Royal Statistical Society, Series A 162:437-446.

Olson, K. and B. Parkhurst. 2013. Collecting paradata for measurement error evaluations. In Improving Surveys with Paradata: Analytic Uses of Process Information, ed. F. Kreuter, 43-72. Hoboken, NJ: John Wiley & Sons.

Olson, K. and A. Peytchev. 2007. Effect of interviewer experience on interview pace and interviewer attitudes. Public Opinion Quarterly 71:273-286.

Olson, K. and J. D. Smyth. 2015. The effect of CATI questions, respondents, and interviewers on response time. Journal of Survey Statistics and Methodology 3:361-396.

Olson, K., J. D. Smyth, R. Horwitz, S. Keeter, V. Lesser, S. Marken, N. Mathiowetz, Gurtekin, C. Turakhia, and J. Wagner. 2019. Transitions from Telephone Surveys to Self-Administered and Mixed-Mode Surveys. AAPOR Task Force Report. Oakbrook Terrace, IL: American Association for Public Opinion Research.

interviewer travel behavior.

Sakshaug, J. W. 2013. Using paradata to study response to within-survey requests. In Improving Surveys with Paradata: Analytic Uses of Process Information, ed. F. Kreuter, 171-190. Hoboken, NJ: John Wiley and Sons.

Sakshaug, J. W., M. P. Couper, M. B. Ofstedal, and D. R. Weir. 2012. Linking survey and administrative records: Mechanisms of consent. Sociological Methods & Research 41:535-569.

Saris, W. E. and I. N. Gallhofer. 2007. Design, Evaluation, and Analysis of Questionnaires for Survey Research. Hoboken, NJ: John Wiley and Sons.

in the standardized interview. In Measurement Errors in Surveys, ed. P. Biemer, R. M. Groves, L. E. Lyberg, N. A. Mathiowetz, and S. Sudman. New York: John Wiley & Sons, Inc.

Schaeffer, N. C., J. Dykema, and D. W. Maynard. 2010. Interviewers and interviewing. In Handbook of Survey Research, 2nd ed., ed. P. V. Marsden and J. D. Wright, 437-470. Bingley, UK: Emerald Group Publishing.

model of the call for survey participation: Actions and reactions in the survey recruitment call.

and response: Predicting participation from the call opening.

Schober, M. F. and F. G. Conrad. 1997. Does conversational interviewing reduce survey measurement error? Public Opinion Quarterly 61:576-602.

Suchman, L. and B. Jordan. 1990. Interactional troubles in face-to-face survey interviews. Journal of the American Statistical Association 85:232-241.

Timbrook, J., J. Smyth, and K. Olson. 2018. Why do mobile interviews take longer? A behavior coding perspective.

Viterna, J. and D. W. Maynard. 2002. How uniform is standardization? Variation within and across survey research centers regarding protocols for interviewing. In Standardization and Tacit Knowledge: Interaction and Practice in the Survey Interview, ed. D. W. Maynard, H. Houtkoop-Steenstra, N. C. Schaeffer, and J. van der Zouwen. New York: John Wiley & Sons.

West, B. T. and A. G. Blom. 2017. Explaining interviewer effects: A research synthesis. Journal of Survey Statistics and Methodology 5:175-211.

West, B. T., F. Kreuter, and U. Jaenichen. 2013. "Interviewer" effects in face-to-face surveys: A function of sampling, measurement error, or nonresponse? Journal of Official Statistics 29:277-297.

West, B. T. and K. Olson. 2010. How much of interviewer variance is really nonresponse error variance? Public Opinion Quarterly 74:1004-1026.