Article

Abstract

When potential survey respondents decide whether or not to participate in a telephone interview, they may consider what it would be like to converse with the interviewer who is currently inviting them to respond, e.g. how he or she sounds, speaks and interacts. In the study that is reported here, we examine the effect of three interactional speech behaviours on the outcome of survey invitations: interviewer fillers (e.g. 'um' and 'uh'), householders' backchannels (e.g. 'uh huh' and 'I see') and simultaneous speech or 'overspeech' between interviewer and householder. We examine how these behaviours are related to householders' decisions to participate (agree), to decline the invitation (refusal) or to defer the decision (scheduled call-back) in a corpus of 1380 audio-recorded survey invitations (contacts). Agreement was highest when interviewers were moderately disfluent—neither robotic nor so disfluent as to appear incompetent. Further, household members produced more backchannels, a behaviour which is often assumed to reflect a listener's engagement, when they ultimately agreed to participate than when they refused. Finally, there was more simultaneous speech in contacts where householders ultimately refused to participate; however, interviewers interrupted household members more when they ultimately scheduled a call-back, seeming to pre-empt householders' attempts to refuse. We discuss implications for hiring and training interviewers, as well as the development of automated speech interviewing systems.
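The behaviours studied here reduce to simple per-contact counts and rates. As a rough illustration only (the data values and column names below are invented, not taken from the study), one might tabulate such measures against the three invitation outcomes like this:

```python
# Toy sketch: comparing mean behaviour rates across invitation outcomes.
# All values and column names are illustrative, not from the paper.
import pandas as pd

contacts = pd.DataFrame({
    "filler_rate":     [0.0, 1.2, 2.5, 4.8, 1.8, 0.3],  # 'um'/'uh' per 100 interviewer words
    "backchannels":    [0, 2, 3, 1, 4, 0],               # householder 'uh huh', 'I see' counts
    "overspeech_secs": [1.5, 0.4, 0.2, 3.1, 0.3, 2.0],   # seconds of simultaneous speech
    "outcome":         ["refuse", "agree", "agree", "refuse", "callback", "refuse"],
})

# Compare mean behaviour measures across the three outcomes
print(contacts.groupby("outcome")[["filler_rate", "backchannels", "overspeech_secs"]].mean())
```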


... If the interaction between an interviewer and householder is recorded, each of these properties can be turned into measurements and paradata for analysis purposes. For example, characteristics of interactions such as telephone interviewers' speech rate and pitch, measured through acoustic analyses of audio-recorded interviewer introductions, have been shown to be associated with survey response rates (Sharf and Lehman, 1984; Oksenberg and Cannell, 1988; Benki et al., 2011; Conrad et al., 2013). ...
... One reason for these disparate findings may be related to nonlinearities in the relationship between acoustic measurements and survey outcomes. For example, Conrad et al. (2013) showed a curvilinear relationship between agreement to participate and the level of disfluency in the interviewers' speech across several phone surveys. Using data from Conrad et al. (2013), Figure 6 shows that agreement rates (plotted on the y-axis) were lowest when interviewers spoke without any fillers (e.g., "ums" and "uhs", plotted on the x-axis), a style often called robotic speech, and highest with a moderate number of fillers per 100 words. ...
... For example, Conrad et al. (2013) showed a curvilinear relationship between agreement to participate and the level of disfluency in the interviewers' speech across several phone surveys. Using data from Conrad et al. (2013), Figure 6 shows that agreement rates (plotted on the y-axis) were lowest when interviewers spoke without any fillers (e.g., "ums" and "uhs", plotted on the x-axis), a style often called robotic speech, and highest with a moderate number of fillers per 100 words. An interviewer's pitch also affects agreement rates: here, interviewers with low pitch variation in their voice (the dashed line) were on average more successful in recruiting respondents than those with high pitch variation (the dotted line). ...
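In regression terms, the curvilinear pattern described above is a quadratic effect. A minimal sketch of how such a relationship might be tested, using simulated data rather than Conrad et al.'s:

```python
# Hedged sketch: logistic regression of agreement on the filler rate and its
# square. A positive linear and negative quadratic coefficient are consistent
# with a mid-range peak. Data are simulated; parameter values are invented.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500
filler_rate = rng.uniform(0, 6, n)                    # fillers per 100 words
logit_p = -1.0 + 1.5 * filler_rate - 0.4 * filler_rate**2  # peak near ~1.9
agree = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

X = sm.add_constant(np.column_stack([filler_rate, filler_rate**2]))
fit = sm.Logit(agree, X).fit(disp=False)
print(fit.params)   # expect: positive linear term, negative quadratic term
```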
Article
Full-text available
Nonresponse is a ubiquitous feature of almost all surveys, no matter which mode is used for data collection (Dillman et al., 2002), whether the sample units are households or establishments (Willimack et al., 2002), or whether the survey is mandatory or not (Navarro et al., 2012). Nonresponse leads to loss in efficiency and increases in survey costs if a target sample size of respondents is needed. Nonresponse can also lead to bias in the resulting estimates if the mechanism that leads to nonresponse is related to the survey variables (Groves, 2006). Confronted with this fact, survey researchers search for strategies to reduce nonresponse rates and to reduce nonresponse bias, or at least to assess the magnitude of any nonresponse bias in the resulting data. Paradata can be used to support all of these tasks, either prior to the data collection to develop best strategies based on past experiences, during data collection using paradata from the ongoing process, or post hoc when empirically examining the risk of nonresponse bias in survey estimates or when developing weights or other forms of nonresponse adjustment. This chapter will start with a description of the different sources of paradata relevant for nonresponse error investigation, followed by a discussion about the use of paradata to improve data collection efficiency, examples of the use of paradata for nonresponse bias assessment and reduction, and some data management issues that arise when working with paradata.
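As one concrete illustration of the weighting use mentioned at the end of the abstract, here is a minimal sketch, with invented paradata variables, of a response-propensity model whose inverse predictions serve as nonresponse adjustment weights:

```python
# Hedged sketch: paradata (number of call attempts, ever-contacted flag) as
# predictors of response, then inverse-propensity weights for respondents.
# Data are simulated; variable names are hypothetical.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 1000
call_attempts = rng.poisson(3, n)
ever_contacted = rng.binomial(1, 0.8, n)
lin = 0.5 - 0.2 * call_attempts + 1.0 * ever_contacted
responded = rng.binomial(1, 1 / (1 + np.exp(-lin)))

X = sm.add_constant(np.column_stack([call_attempts, ever_contacted]))
propensity = sm.Logit(responded, X).fit(disp=False).predict(X)

weights = 1.0 / propensity[responded == 1]   # weights for respondents only
print(weights[:5])
```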
... From other domains of interaction with virtual agents, the evidence is that people judge agents with more (bodily) motion as more acceptable and human (Piwek et al., 2014), and that realistic characters that move more are judged more positively (Hyde et al., 2013). The benefits of more human-like behavior may well extend to the survey context: Conrad et al. (2013) demonstrated that people invited to participate in (human-administered) telephone survey interviews were more likely to agree to participate when the interviewers spoke less robotically (with more disfluencies) during the invitation interaction. And Foucault Welles and Miller (2013) demonstrated that respondents in face-to-face (human-administered) survey interviews reported feeling greater rapport (which is presumably related to their feelings of engagement) when interviewers nodded and smiled more, and when they gazed at respondents' faces less. ...
... Note that all of these ratings are lower than one would expect if the respondents evaluated the virtual interviewer as being very human-like. But given the constraints of a standardized interviewing situation, it is also plausible that human interviewers who implemented these interviews would not be rated as particularly autonomous, personal, close, or sensitive, and they might also be rated as more robotic than human (the term "robotic" is sometimes used to caricature the behavior of rigidly standardized interviewers, for example in survey invitations; see Conrad et al., 2013). As detailed in Table 5, Hypothesis 4 is supported on several fronts. ...
Article
Full-text available
This study investigates how an onscreen virtual agent's dialog capability and facial animation affect survey respondents' comprehension and engagement in "face-to-face" interviews, using questions from US government surveys whose results have far-reaching impact on national policies. In the study, 73 laboratory participants were randomly assigned to respond in one of four interviewing conditions, in which the virtual agent had either high or low dialog capability (implemented through Wizard of Oz) and high or low facial animation, based on motion capture from a human interviewer. Respondents, whose faces were visible to the Wizard (and video-recorded) during the interviews, answered 12 questions about housing, employment, and purchases on the basis of fictional scenarios designed to allow measurement of comprehension accuracy, defined as the fit between responses and US government definitions. Respondents answered more accurately with the high-dialog-capability agents, requesting clarification more often, particularly for ambiguous scenarios; and they generally treated the high-dialog-capability interviewers more socially, looking at the interviewer more and judging high-dialog-capability agents as more personal and less distant. Greater interviewer facial animation did not affect response accuracy, but it led to more displays of engagement (verbal and visual acknowledgments and smiles) and to the virtual interviewer being rated as less natural. The pattern of results suggests that a virtual agent's dialog capability and facial animation differently affect survey respondents' experience of interviews, behavioral displays, and comprehension, and thus the accuracy of their responses. The pattern of results also suggests design considerations for building survey interviewing agents, which may differ depending on the kinds of survey questions (sensitive or not) that are asked.
... (The conventions for acknowledging relative power and mitigating a request probably vary for different populations.) To complete our analysis of the first turn, we include measures of disfluency (e.g., Conrad, Broome, Benki, Kreuter, Groves, et al. 2013) that may affect a sample member's perception of the interviewer as a competent interactional partner. ...
... Most previous research about acoustic or perceived properties of speakers during the opening of the recruitment call has focused on the interviewer and not specifically on "hello" (e.g., Oksenberg and Cannell 1988; Oksenberg, Coleman, and Cannell 1986; van der Vaart, Ongena, Hoogendoorn, and Dijkstra 2006; Groves, O'Hare, Gould-Smith, Benki, and Maher 2008; Conrad et al. 2013). For example, Benki, Broome, Conrad, Kreuter, and Groves (2011) considered the interviewer's average median pitch and variability in pitch over the first 13 turns, not just "hello." ...
Article
Although researchers have used phone surveys for decades, the lack of an accurate picture of the call opening reduces our ability to train interviewers to succeed. Sample members decide about participation quickly. We predict participation using the earliest moments of the call; to do this, we analyze matched pairs of acceptances and declinations from the Wisconsin Longitudinal Study using a case-control design and conditional logistic regression. We focus on components of the first speaking turns: acoustic-prosodic components and the interviewer's actions. The sample member's "hello" is external to the causal processes within the call and may carry information about the propensity to respond. As predicted by Pillet-Shore (2012), we find that when the pitch span of the sample member's "hello" is greater, the odds of participation are higher; but in contradiction to her prediction, the (less reliably measured) pitch pattern of the greeting does not predict participation. The structure of actions in the interviewer's first turn has a large impact. The large majority of calls in our analysis begin with either an "efficient" or a "canonical" turn. In an efficient first turn, the interviewer delays identifying themselves (and thereby suggesting the purpose of the call) until they are sure they are speaking to the sample member, with the resulting efficiency that they introduce themselves only once. In a canonical turn, the interviewer introduces themselves and asks to speak to the sample member, but risks having to introduce themselves twice if the answerer is not the sample member. The odds of participation are substantially and significantly lower for an efficient turn compared to a canonical turn. It appears that how interviewers handle identification in their first turn has consequences for participation; an analysis of actions could facilitate experiments to design first interviewer turns for different target populations, study designs, and calling technologies.
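For readers unfamiliar with the matched-pairs design, the following is a rough sketch of a conditional logistic regression of the kind the abstract describes; the data are simulated and the variable names are hypothetical, not drawn from the Wisconsin Longitudinal Study:

```python
# Hedged sketch: within each matched pair, does a wider pitch span of the
# sample member's "hello" predict participation? Conditional logit has no
# intercept; pair effects are conditioned out. Data are simulated.
import numpy as np
from statsmodels.discrete.conditional_models import ConditionalLogit

rng = np.random.default_rng(2)
n_pairs = 200
pitch_span = rng.normal(4.0, 1.5, 2 * n_pairs)   # semitones, illustrative
groups = np.repeat(np.arange(n_pairs), 2)        # matched-pair identifiers
lin = 0.4 * (pitch_span - pitch_span.mean())
participated = rng.binomial(1, 1 / (1 + np.exp(-lin)))

# Pairs without within-pair outcome variation drop out of the likelihood
res = ConditionalLogit(participated, pitch_span[:, None], groups=groups).fit()
print(res.params)   # expect a positive pitch-span coefficient
```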
... (The conventions for acknowledging relative power and mitigating a request probably vary for different populations.) To complete our analysis of the first turn, we include measures of disfluency (e.g., Conrad et al. 2013), which may affect a sample member's perception of the interviewer as a competent interactional partner. ...
... Most previous research about acoustic or perceived properties of speakers during the opening of the recruitment call focuses on the interviewer and not specifically on "hello" (e.g., Oksenberg, Coleman, and Cannell 1986; van der Vaart et al. 2006; Groves et al. 2008; Conrad et al. 2013). For example, Benkí et al. (2011) considered the interviewer's average median pitch and variability in pitch over the first 13 turns, not just "hello." ...
... In contrast to face-to-face interviewers, telephone survey interviewers have just two primary tools that are under their control in their efforts to persuade answerers to participate: what they say (speech) and how they say it (vocal characteristics). A small body of literature (e.g., Sharf and Lehman 1984; Oksenberg et al. 1986; Oksenberg and Cannell 1988; Groves et al. 2007; Conrad et al. 2013) finds relationships between vocal characteristics of interviewers in telephone-survey introductions and interviewer success in obtaining interviews. In general, successful interviewers have been ones who spoke louder (Oksenberg et al. 1986; Oksenberg and Cannell 1988; van der Vaart et al. 2005) and with more falling intonation (Sharf and Lehman 1984; Oksenberg and Cannell 1988). ...
... These introductions, conducted by 100 interviewers, were from five telephone surveys that were audio recorded for another project. In that project, all contacts associated with selected households, regardless of who the interviewer was, were included in the dataset (Benkí et al. 2011; Conrad et al. 2013). The contacts came from 49 different interviewers whose lengths of tenure varied and whose response rates over the course of their tenure at the University of Michigan ranged from .07 to .21 ...
Article
Full-text available
Survey nonresponse may increase the chances of nonresponse error, and different interviewers contribute differentially to nonresponse. This article first addresses the relationship between initial impressions of interviewers in survey introductions and the outcome of these introductions, and then contrasts this relationship with current viewpoints and practices in telephone interviewing. The first study described here exposed judges to excerpts of interviewer speech from actual survey introductions and asked them to rate twelve characteristics of the interviewer. Impressions of positive traits such as friendliness and confidence had no association with the actual outcome of the call, while higher ratings of “scriptedness” predicted lower participation likelihood. At the same time, a second study among individuals responsible for training telephone interviewers found that when training interviewers, sounding natural or unscripted during a survey introduction is not emphasized. This article concludes with recommendations for practice and further research.
... The production of uhm or uh, which serve as fillers in oral production, is common even among native speakers. Fillers give listeners the opportunity to process the information delivered, and a speaker may seem less trustworthy if fillers are eliminated (Conrad et al., 2013). Within the social context of a second language setting, the frequent use of uh or uhm may suggest that Zam has limited proficiency in English. ...
Article
Full-text available
Graduates with a high level of social intelligence are in high demand. Those who can demonstrate a good attitude and social flexibility, build good relationships with others, and use appropriate language during their interviews have higher chances of being employed. However, reports show that graduates of higher learning institutions lack social intelligence. Failure to address this issue can affect future graduates' employability. This paper examined the extent to which students in higher learning institutions had developed social intelligence. The participants were final-year students at one technical university in Malaysia. These participants underwent a job interview session as one of their course assessments. In this paper, the responses of three participants were selected for analysis. The mock job interview sessions, which were conducted online, were recorded and transcribed. The data were analysed using multimodal social semantic discourse analysis to determine the candidates' intentions and then using the five dimensions of social intelligence (social awareness, presence, authenticity, clarity, and empathy) to examine the presence (or absence) of social intelligence. It was found that social awareness and empathy to build relationships and develop trust with the interviewer were taken for granted by the participants. In addition, the participants focused more on the qualifications and skills that they had rather than on how they could use their skills for the benefit of a company or organisation. The findings provide invaluable input on ways of designing courses that promote the development of social intelligence among students of higher learning institutions.
... For telephone surveys, see Conrad et al., 2013; Maynard & Schaeffer, 1997; Schaeffer et al., 2013; etc. See Schaeffer & Presser (2003) for more information on question design, as well as Heritage (2002), Houtkoop-Steenstra (2002), Lavin & Maynard (2002), Moore & Maynard (2002), and Viterna & Maynard (2002) for CA-based analysis of this phase of the survey. Interestingly, the tradition that argues for analyzing individual responses in isolation from one another also has a considerable literature on question order and "context effects" (see, e.g., Smyth et al., 2009). ...
Article
While a growing body of work has focused on the interactional organization of telephone survey interviews, little if any research in conversation and discourse analysis has examined written online surveys as a form of talk-in-interaction. While survey researchers routinely examine such responses using content analysis or thematic analysis methods, this shifts the focus away from the precise language and turn constructional practices used by respondents. By contrast, in this study we examine open-ended text responses to online survey questions using a conversation analytic and discourse analytic approach. Focusing on the precise turn constructional practices used by survey respondents—specifically, how they formulate multi-unit responses and make use of turn-initial discourse markers—we demonstrate how online survey respondents treat open-ended survey questions much as they would any similar sequence of interaction in face-to-face or telephone survey talk, making online surveys a tenable source of data for further conversation analytic inquiry.
... In another recent study, Conrad, Broome, et al. (2013) analyzed a sample of 1,380 audio-recorded telephone invitations to five different surveys by 100 different interviewers from the University of Michigan Survey Research Center, made in most cases as "cold calls." The results demonstrate that invitations in which interviewers were moderately disfluent (e.g., using "um" and "uh" at a moderate rate) were more successful than invitations that were "robotic" (overly fluent) or painfully disfluent. ...
Article
Many of the official statistics and leading indicators that inform policy decisions are created from aggregating data collected in scientific survey interviews. What happens in the back-and-forth of those interviews—whether a sampled member of the public agrees to participate or not, whether a respondent comprehends questions in the way they were intended or not, whether the interview is spoken or texted—can thus have far-reaching consequences. But the landscape for social measurement is rapidly changing: Participation rates are declining, and people’s daily communication patterns are evolving with new technologies (text messaging, video chatting, social media posting, etc.). New analyses of survey interactions are demonstrating aspects of interviewer speech that can substantially affect survey participation, which is vital if social measurement is to be trustworthy. Findings also suggest that, once a survey interview starts, the risks of misunderstanding and miscommunication are greater than one might expect, potentially jeopardizing the accuracy of survey results; different approaches to interviewing that allow clarification dialogue can improve respondents’ comprehension and thus survey data quality. Analyses of text messaging and voice interviews on smartphones demonstrate the importance of adapting scientific social measurement to new patterns of communication, adding ways for people to contribute their data at a time and in a mode that is convenient for them even when they are mobile or multitasking.
... Post-survey adjustments typically rely on one or more auxiliary measures Z, which are available for each sampled unit and correlated with both the response outcome and survey variables (Kalton and Flores-Cervantes, 2003; Little and Vartivarian, 2005; Bethlehem et al., 2011; Brick, 2013). The search for suitable Z variables has recently turned to paradata, which refer to data generated as a by-product of the survey process (Couper, 1998; Blom, 2009; Olson, 2013; Conrad et al., 2013; Kreuter and Olson, 2013). These data are attractive because they are available for all sampled units at little extra cost (Groves, 2006). ...
Article
Full-text available
Sequence analysis is widely used in life course research and more recently has been applied by survey methodologists to summarize complex call record data. However, summary variables derived in this way have proved ineffective for post-survey adjustments, owing to weak correlations with key survey variables. We reflect on the underlying optimal matching algorithm and test the sensitivity of the output to input parameters or ‘costs’, which must be specified by the analyst. The results illustrate the complex relationship between these costs and the output variables which summarize the call record data. Regardless of the choice of costs, there was a low correlation between the summary variables and the key survey variables, limiting the scope for bias reduction. The analysis is applied to call records from the Irish Longitudinal Study on Ageing, which is a nationally representative, face-to-face household survey.
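For orientation, the optimal matching algorithm the abstract refers to is an edit distance over call-outcome sequences, and the analyst-chosen substitution and insertion/deletion ("indel") costs are exactly the input parameters whose influence the paper tests. A toy sketch, with invented outcome codes:

```python
# Toy sketch of optimal matching: an edit distance between two call-outcome
# sequences with analyst-specified substitution and indel costs.
def optimal_matching(seq_a, seq_b, sub_cost=2.0, indel_cost=1.0):
    """Edit distance between two call-outcome sequences."""
    n, m = len(seq_a), len(seq_b)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i * indel_cost
    for j in range(1, m + 1):
        d[0][j] = j * indel_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0.0 if seq_a[i - 1] == seq_b[j - 1] else sub_cost
            d[i][j] = min(d[i - 1][j - 1] + sub,       # substitute
                          d[i - 1][j] + indel_cost,    # delete
                          d[i][j - 1] + indel_cost)    # insert
    return d[n][m]

# N = no contact, C = contact, I = interview (invented codes)
print(optimal_matching("NNCI", "NCI"))                   # 1.0 (one deletion)
print(optimal_matching("NNCI", "NCI", indel_cost=3.0))   # 3.0 (same alignment, new cost)
```

Raising the indel cost changes the computed distance, which is precisely the kind of sensitivity to analyst-specified costs that the authors probe.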
... Verification and confirmation request (e.g., "you said your husband, is that correct?"). Our coding scheme complements those developed by others in terms of describing the content of utterances and, to a certain extent, the placement of utterances (see, e.g., Conrad et al. 2013; Dijkstra 1999; Dijkstra and Ongena 2006; Dykema, Lepkowski, and Blixt 1997; Dykema and Schaeffer 2004; Schaeffer et al. 2013; Schober et al. 2012; Schober and Bloom 2004; Schober, Conrad, and Fricker 2004). These coding schemes and ours have in common attending to the content of utterances, even the most micro utterances, such as what have been called "fillers" ...
Article
Full-text available
"Rapport" has been used to refer to a range of positive psychological features of an interaction, including a situated sense of connection or affiliation between interactional partners, comfort, willingness to disclose or share sensitive information, motivation to please, or empathy. Rapport could potentially benefit survey participation and response quality by increasing respondents' motivation to participate, disclose, or provide accurate information. Rapport could also harm data quality if motivation to ingratiate or affiliate caused respondents to suppress undesirable information. Some previous research suggests that motives elicited when rapport is high conflict with the goals of standardized interviewing. We examine rapport as an interactional phenomenon, attending to both the content and structure of talk. Using questions about end-of-life planning in the 2003-2005 wave of the Wisconsin Longitudinal Study, we observe that rapport consists of behaviors that can be characterized as dimensions of responsiveness by interviewers and engagement by respondents. We identify and describe types of responsiveness and engagement in selected question-answer sequences and then devise a coding scheme to examine their analytic potential with respect to the criterion of future study participation. Our analysis suggests that responsive and engaged behaviors vary with respect to the goals of standardization: some conflict with these goals, while others complement them.
... An additional set of vocal characteristics distinct from those typically examined through behavior codes are interruptions to a fluid speech pattern, such as disfluencies ("uh," "um"; Ehlen et al., 2007), backchannels ("I see," "uh huh"; Conrad et al., 2013; Jans, 2010), or laughter (Bilgen, 2011). These behaviors are not directly task related, but instead are related to normal conversational behaviors (Jans, 2010). ...
Chapter
Full-text available
Paradata can be collected at a variety of levels, resulting in a complex, hierarchical data structure. This chapter describes a wide variety of types of paradata, the kinds of paradata available by mode, and some of the challenges involved in turning paradata into analytic variables. These paradata include automatically captured timing data, keystroke data, and mouse click data, and researcher-designed behavior codes, vocal characteristics, and interviewer evaluations. Measurement-error-related paradata can be collected at four levels of aggregation: the survey level, the section level, the question level, and the action level. The types of paradata that can be captured vary by mode of data collection, driven largely by the software being used for data collection and the people who are interacting with the survey instrument, that is, interviewer or respondent.
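The four aggregation levels can be pictured as a nested data structure. The sketch below is purely illustrative; the field names are invented, not taken from the chapter:

```python
# Illustrative sketch of hierarchical paradata: actions nest within
# questions, questions within sections, sections within a survey instance.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Action:                  # action level: e.g., one keystroke or click
    timestamp_ms: int
    kind: str                  # "keystroke", "click", "backup", ...

@dataclass
class Question:                # question level: timing plus nested actions
    name: str
    answer_time_ms: int
    actions: List[Action] = field(default_factory=list)

@dataclass
class Section:                 # section level
    name: str
    questions: List[Question] = field(default_factory=list)

@dataclass
class SurveyInstance:          # survey level
    case_id: str
    interviewer_id: str
    total_time_ms: int
    sections: List[Section] = field(default_factory=list)
```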
... Many studies have utilized audio recordings of the survey introduction, either on the doorstep or the telephone, to better understand variability in interviewer behaviors during this important interaction. Studies of the vocal characteristics and speech patterns of interviewers have consistently found that moderate speech characteristics (e.g., not speaking too fast or too slow) during the introduction increase cooperation rates (Oksenberg, Coleman, and Cannell 1986; Oksenberg and Cannell 1988; Groves, O'Hare, Gould-Smith, Benki, and Maher 2008; Conrad, Broome, Benki, Kreuter, Groves, et al. 2013). Morton-Williams (1993) found that interviewers who used concise introductions, maintained interaction, used social skills to tailor responses to reluctance, and used friendlier introductions produced higher response rates. ...
Article
A rich and diverse literature exists on the effects that human interviewers can have on different aspects of the survey data collection process. This research synthesis uses the Total Survey Error (TSE) framework to highlight important historical developments and advances in the study of interviewer effects on a variety of important survey process outcomes, including sample frame coverage, contact and recruitment of potential respondents, survey measurement, and data processing. Included in the scope of the synthesis is research literature that has focused on explaining variability among interviewers in these effects and the different types of variable errors that they can introduce, which can ultimately affect the efficiency of survey estimates. We first consider common tasks with which human interviewers are often charged and then use the TSE framework to organize and synthesize the literature discussing the variable errors that interviewers can introduce when attempting to execute each task. Based on our synthesis, we identify key gaps in knowledge and then use these gaps to motivate an organizing model for future research investigating explanations for interviewer effects on different aspects of the survey data collection process.
... The second type of respondent behaviors includes nonverbal utterances such as disfluencies and laughter. Nonverbal utterances are part of normal conversational behaviors and are not directly related to the task of responding (Jans 2010; Conrad et al. 2013). Speech disfluencies such as fillers ("ums" and "uhs"), stutters, and repairs are related to comprehension problems and difficulties with tasks requiring higher cognitive ability (e.g., Schober and Bloom 2004). ...
Article
Full-text available
Survey interviewers are often tasked with assessing the quality of respondents' answers after completing a survey interview. These interviewer observations have been used to proxy for measurement error in interviewer-administered surveys. How interviewers formulate these evaluations and how well they proxy for measurement error have received little empirical attention. According to dual-process theories of impression formation, individuals form impressions about others based on the social categories of the observed person (e.g., sex, race) and individual behaviors observed during an interaction. Although initial impressions start with heuristic, rule-of-thumb evaluations, systematic processing is characterized by extensive incorporation of available evidence. In a survey context, if interviewers default to heuristic information processing when evaluating respondent engagement, then we expect their evaluations to be primarily based on respondent characteristics and stereotypes associated with those characteristics. Under systematic processing, on the other hand, interviewers process and evaluate respondents based on observable respondent behaviors occurring during the question-answering process. We use the Work and Leisure Today Survey, including survey data and behavior codes, to examine proxy measures of heuristic and systematic processing by interviewers as predictors of interviewer postsurvey evaluations of respondents' cooperativeness, interest, friendliness, and talkativeness. Our results indicate that CATI interviewers base their evaluations on actual behaviors during an interview (i.e., systematic processing) rather than perceived characteristics of the respondent or the interviewer (i.e., heuristic processing). These results are reassuring for the many surveys that collect interviewer observations as proxies for data quality.
Chapter
This chapter reviews the existing literature on the quality of paradata. It considers studies presenting both direct evaluations of the quality of a variety of collected paradata, where "gold standard" validation data are available, and indirect indicators of quality, including reliability, ease of collection, and missing data issues. The chapter provides an overview of the available literature on computer-generated paradata and interviewer-recorded call records. It also focuses on the quality of interviewer observations, as most of the current research addresses this type of paradata. It concludes with a theoretical discussion of mechanisms that could introduce error in various types of paradata.
Article
Survey methodology is a relatively new academic discipline focused on understanding sources of survey errors. As an interdisciplinary field, survey methodology borrows theoretical approaches from other disciplines and applies them to understand how survey respondents answer questions. One field in particular, cognitive psychology, has played a central role in the development of survey methodology. The cognitive approach has focused researchers' attention on the sources of error at each stage of the cognitive process respondents use to answer a survey question: comprehension of the question, recollection of relevant information, estimation and judgment, and reporting an answer. Although this focus on the cognitive response process has been positive and fruitful, potentially strong social and interactional influences on the response process have been underinvestigated and undertheorized. Thus, this essay argues for a revitalized research program in the sociological social psychology of survey methodology, given its rich body of theory and research. The current strengths of social psychological and interactional approaches are highlighted, focusing primarily on recent work using identity theory to understand social desirability biases. Finally, potentially fruitful future directions for research are proposed, matching sociological social psychological theories to the survey errors upon which they may shed light.
Article
Deviations from reading survey questions exactly as worded may change the validity of the questions, thus increasing measurement error. Hence, organizations train their interviewers to read questions verbatim. To ensure interviewers are reading questions verbatim, organizations rely on interview recordings. However, this takes a significant amount of resources. Therefore, some organizations are using paradata generated by the survey software, specifically timestamps, to try to detect when interviewers deviate from reading the question verbatim. To monitor interviewers' question-reading behavior using timestamps, some organizations estimate the expected question administration time to establish minimum and maximum question administration time thresholds (QATTs). They then compare each question's timestamp to these QATTs to identify questions that violate them. Violations of minimum QATTs may indicate that interviewers omitted words from the question text. Conversely, violations of maximum QATTs may indicate that interviewers added words to the question text. Questions that violate the QATTs are then flagged for further investigation. Investigations may include listening to the recording for the flagged question or aggregating the flagged questions up to the interviewer level to identify interviewers who repeatedly engage in question-reading deviations. Organizations can then make decisions about training needs or disciplinary actions based on empirical data. However, there is no established method for calculating QATTs. Some organizations calculate QATTs by dividing the number of question words by a chosen reading pace (Sun & Meng, 2014) or by using an a priori cutoff, such as one second (Mneimneh, Pennell, Lin, & Kelley, 2014). Further, little is known about the accuracy of the methods currently used to detect question-reading deviations, or whether a more accurate method is needed. Which QATT method is more accurate for detecting question-reading deviations? Should one construct QATTs using words per second (WPS) or standard deviations of the mean reading time? What WPS rate or standard deviation should be used? Is one detection method better for detecting certain types of deviations (e.g., skipping words or questions, adding words to the question, etc.)? This study attempts to answer these questions using interview recordings and paradata from Wave 3 of the Understanding Society Innovation Panel, United Kingdom. Using interview recordings allows a direct comparison of the different detection methods against how interviewers actually administered each question, and thus a measurement of the accuracy of each detection method. In addition, the interview recordings were coded for the extent (i.e., minor or major) and type of deviation. This analysis gives better insight into the scope and types of deviations interviewers engage in, and practical guidance on how best to detect deviations using paradata.
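A minimal sketch of the words-per-second variant of QATT construction described above; the 6.0 and 2.0 words-per-second bounds are invented for illustration, not taken from the study:

```python
# Hedged sketch: flag questions whose administration time violates
# words-per-second QATTs. Bounds are illustrative assumptions.
def qatt_flags(question_text, admin_seconds, fastest_wps=6.0, slowest_wps=2.0):
    """Flag a question whose administration time violates its QATTs.

    fastest_wps sets the minimum plausible administration time (violations
    suggest omitted words); slowest_wps sets the maximum (violations
    suggest added words).
    """
    n_words = len(question_text.split())
    min_time = n_words / fastest_wps   # fastest acceptable administration
    max_time = n_words / slowest_wps   # slowest acceptable administration
    return {
        "too_fast": admin_seconds < min_time,   # possible omissions
        "too_slow": admin_seconds > max_time,   # possible additions
    }

q = "During the past 12 months how many times did you see a doctor"
print(qatt_flags(q, admin_seconds=1.2))   # {'too_fast': True, 'too_slow': False}
```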
Article
As people increasingly adopt SMS text messaging for communicating in their daily lives, texting becomes a potentially important way to interact with survey respondents, who may expect that they can communicate with survey researchers as they communicate with others. Thus far our evidence from analyses of 642 iPhone interviews suggests that text interviewing can lead to higher quality data (less satisficing, more disclosure) than voice interviews on the same device, whether the questions are asked by an interviewer or an automated system. Respondents also report high satisfaction with text interviews, with many reporting that text is more convenient because they can continue with other activities while responding. But the interaction with an interviewer in a text interview is substantially different than in a voice interview, with much less of a sense of the interviewer's social presence as well as quite different time pressure. In principle, this suggests there should be different potential for interviewer effects in text than in voice. In this paper we report analyses of how text interviews differed from voice interviews in our corpus, as well as how interviews with human interviewers differed from interviews with automated interviewing systems in both modes, based on transcripts and coding of multiple features of the interaction. Text interviews took more than twice as long as voice interviews, but the amount of time between turns (text messages) was large, and the total number of turns was two thirds as many as in voice interviews. As in the voice interviews, text interviews with human interviewers involved a small but significantly greater number of turns than text interviews with automated systems, not only because respondents engaged in small "talk" with human interviewers but because they requested clarification and help with the survey task more often than with the automated text interviewer. Respondents were more likely to type out full response options (as opposed to equally acceptable single character responses) with a human text interviewer. Analyses of the content and format of text interchanges compared to voice interchanges demonstrate potential improvements in data quality and ease for respondents, but also pitfalls and challenges that a more asynchronous mode brings. The "anytime anywhere" qualities of text interviewing may reduce pressure to answer quickly, allowing respondents to answer more thoughtfully and to consult records even if they are mobile or multitasking. From a Total Survey Error perspective, the more streamlined nature of text interaction, which largely reduces the interview to its essential question-asking and -answering elements, may help reduce the potential for unintended interviewer influence.
Article
This article presents an analysis of interviewer effects on the process leading to cooperation or refusal in face-to-face surveys. The focus is on the interaction between the householder and the interviewer on the doorstep, including initial reactions from the householder, and interviewer characteristics, behaviors, and skills. In contrast to most previous research on interviewer effects, which analyzed final response behavior, the focus here is on the analysis of the process that leads to cooperation or refusal. Multilevel multinomial discrete-time event history modeling is used to examine jointly the different outcomes at each call, taking account of the influence of interviewer characteristics, call histories, and sample member characteristics. The study benefits from a rich data set comprising call record data (paradata) from several face-to-face surveys linked to interviewer observations, detailed interviewer information, and census records. The models have implications for survey practice and may be used in responsive survey designs to inform effective interviewer calling strategies.
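For readers unfamiliar with the technique, a discrete-time multinomial event history model of call outcomes is commonly written as a multinomial logit with a random interviewer effect; the notation below is a generic textbook form, not taken from the article:

```latex
% Log-odds of call outcome k (e.g., cooperation or refusal) versus the
% reference outcome (no final outcome yet) at call t of case i worked by
% interviewer j; \alpha_{kt} is a call-number baseline, x_{ij} collects
% interviewer, call-history, and sample-member covariates, and u_{jk} is
% an interviewer-level random effect.
\log \frac{\Pr(Y_{ijt} = k)}{\Pr(Y_{ijt} = 0)}
  = \alpha_{kt} + \mathbf{x}_{ij}^{\top} \boldsymbol{\beta}_k + u_{jk},
\qquad u_{jk} \sim N(0, \sigma_{u_k}^{2})
```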
Article
Growing rates of nonresponse to telephone surveys can contribute to nonresponse error, and interviewers contribute differentially to nonresponse. Why do some telephone interviewers have better response rates than others? This study uncovered a critical behavior of successful telephone interviewers over the course of introductions: responsive feedback. Using detailed coding of telephone introductions, I explored interviewers’ speech. Interviewers were most responsive to answerers in contacts that resulted in deferrals and least responsive in refusals. Practical applications for telephone interviewer training are discussed.
Article
Full-text available
When potential respondents consider whether or not to participate in a telephone interview, they have very little information about the interviewer, aside from what they hear over the phone. Yet interviewers vary widely in how often their invitations lead to participation, suggesting that potential respondents may give considerable weight not only to the content of such invitations, but also to the style, rhythm, phrasing, and other prosodic attributes of interviewers. We examine the impact of three prosodic attributes of interviewers (speech rate, pitch, and pausing) on the outcome of specific telephone survey invitations (agree-to-participate, scheduled-callback, and refusal) in a corpus of 1380 audio-recorded survey introductions (contacts). Agreement was highest when interviewers spoke at a moderate rate (3.5 words/sec) and paused at a moderate rate as well, at least once during the invitation but not more than about once every other conversational turn. The median interviewer pitch in successful contacts with both male and female interviewers was significantly lower than in refusals. However, variation in pitch functioned differently for each sex, with increased pitch variability more helpful for female interviewers but harmful for male interviewers. We interpret the advantage of moderate speaking and pausing rates in this corpus as indicative of respondent preference for extemporaneous and competent deliveries, and a dispreference for overly scripted deliveries.
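The two rate measures featured in this abstract are straightforward to compute from a timed transcript. A toy sketch, with an invented data layout:

```python
# Toy sketch: interviewer speech rate (words/sec) and pause rate
# (pauses per interviewer turn) from a timed transcript.
# Each turn is (speaker, n_words, speaking_secs, n_pauses); values invented.
turns = [
    ("interviewer", 14, 4.1, 1),
    ("householder",  3, 0.9, 0),
    ("interviewer", 22, 6.0, 2),
]

iv = [t for t in turns if t[0] == "interviewer"]
speech_rate = sum(t[1] for t in iv) / sum(t[2] for t in iv)   # words/sec
pause_rate = sum(t[3] for t in iv) / len(iv)                  # pauses/turn

print(f"speech rate: {speech_rate:.2f} words/sec")  # ~3.5 is the reported sweet spot
print(f"pause rate:  {pause_rate:.2f} pauses per interviewer turn")
```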
Chapter
One class of paradata is interviewer call record data, sometimes referred to as call history data. For cross-sectional surveys, call data may be recorded at each call to a sample unit. For longitudinal surveys, a potentially complicating factor is that such data can be recorded for all calls at a particular wave and across waves, leading to a wealth of data. The aim of this chapter is to introduce the reader to the analysis of call record data, explain the basic ideas of multilevel modeling, and highlight some of the advantages. It focuses on the use of multilevel event history analysis to analyze different outcomes across calls. The chapter discusses a number of example research questions to demonstrate how such data may be used and modeled, and considers call record data for both face-to-face and telephone interview surveys.
Article
Full-text available
Kish's (1962) classical intra-interviewer correlation (ρ_int) provides survey researchers with an estimate of the effect of interviewers on variation in measurements of a survey variable of interest. This correlation is an undesirable product of the data collection process that can arise when answers from respondents interviewed by the same interviewer are more similar to each other than answers from other respondents, decreasing the precision of survey estimates. Estimation of this parameter, however, uses only respondent data. The potential contribution of variance in nonresponse errors between interviewers to the estimation of ρ_int has been largely ignored. Responses within interviewers may appear correlated because the interviewers successfully obtain cooperation from different pools of respondents, not because of systematic response deviations. This study takes a first step in filling this gap in the literature on interviewer effects by analyzing a unique survey data set, collected using computer-assisted telephone interviewing (CATI) from a sample of divorce records. This data set, which includes both true values and reported values for respondents and a CATI sample assignment that approximates interpenetrated assignment of subsamples to interviewers, enables the decomposition of interviewer variance in means of respondent reports into nonresponse error variance and measurement error variance across interviewers. We show that in cases where there is substantial interviewer variance in reported values, the interviewer variance may arise from nonresponse error variance across interviewers.
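In symbols, and with notation that is generic rather than the article's own, Kish's intra-interviewer correlation and the decomposition the abstract motivates can be written:

```latex
% \sigma_b^2: between-interviewer variance of respondent reports;
% \sigma_w^2: within-interviewer variance.
\rho_{\mathrm{int}} = \frac{\sigma_b^2}{\sigma_b^2 + \sigma_w^2}

% Interviewer j's mean report can deviate from the full-sample true mean
% through whom j recruited (nonresponse error) and how j's respondents
% answered (measurement error):
\bar{y}_j - \bar{Y}
  = \underbrace{(\bar{Y}_{r,j} - \bar{Y})}_{\text{nonresponse error}}
  + \underbrace{\bar{e}_{r,j}}_{\text{measurement error}}
```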
Article
The interviewer's voice, manner, personal characteristics, and persuasion strategies are related to participation in telephone interviews. A comparison of groups reveals that the interviewer's manner and personal characteristics have a stronger influence on participation than the interviewer's voice. Vocal attributes are intercorrelated with personality. In addition, considerable differences were found in multivariate correlations between interviewers' voices on the one hand and personal characteristics on the other. Only three characteristics have significant effects: enthusiastic speech, personal speech, and rate of speaking. Other results concern persuasion strategies: repeating the recipient's arguments, tailoring, and formulating different arguments. The findings point to the importance of a training program for interviewers.
Article
From a turn-taking perspective, simultaneous speech in conversation can be an aberration according to the one-speaker-at-a-time norm or a legitimate form of nondisruptive talk according to the all-together-now view. A coding system was devised that identified unambiguous instances of both of these types of simultaneous speech and tested whether the relative distribution of these two types of simultaneous speech varies according to the characteristics of conversation (Experiments 1 and 2) and whether the content of the two types of simultaneous speech differs (Experiment 3). The results from Experiments 1 and 2 confirmed that there are two distinct types of simultaneous speech coexisting in the same conversation and showed that changes in participant familiarity (but not topic orientation) were significantly related to variations in the relative distribution of the two types of simultaneous speech. The results of Experiment 3 showed that the content of the two types of simultaneous speech differs along a supportive (all-together-now simultaneous speech) and unsupportive (one-at-a-time simultaneous speech) dimension.
Article
Investigated the placement of auditor smile beginnings in the stream of dyadic interaction, using detailed transcriptions of the language, paralanguage, and body motion of the participants in four two-person conversations recorded on videotape. Auditor smile beginnings showed a strong tendency to occur at the same kinds of location as "back channel" responses (such as "yeah," "uh-huh," and head nods). This finding indicates that the smiles can function as a type of back channel. It is argued that smiles, like other forms of back channel, make communication more efficient by providing the speaker with feedback on a number of levels simultaneously.
Article
This study examined the manner in which 10 specifically language-impaired (SLI) children and their linguistically normal chronological age (CA) and language age (LA) matched peers repaired overlapping speech. Conversational samples were elicited by an adult examiner from each subject. Instances of overlapping speech were analyzed as being either sentence initial or sentence internal (Gallagher & Craig, 1982). Both types of overlaps were then examined to determine if they required repair, and if so, how they were repaired. It was found that the proportional occurrence of both types of overlap was relatively similar across all three groups. Further, the frequency and nature of repair following sentence initial overlaps was similar across all three groups. However, SLI subjects produced a significantly greater number of unrepaired sentence internal overlaps than did either their CA or LA matched peers.
Article
The structure of speaker–auditor interaction during speaking turns was explored, using detailed transcriptions of language, paralanguage, and body-motion behaviors displayed by both participants in dyadic, face-to-face conversations. On the basis of certain observed regularities in these behaviors, three signals were hypothesized: (a) a speaker within-turn signal, (b) an auditor back-channel signal, and (c) a speaker continuation signal. These signals were composed of various behaviors in language and in body motion. It was further hypothesized that the display of appropriate ordered sequences of these signals by both participants served to mark 'units of interaction' during speaking turns. Keywords: conversational analysis; speaking turns; back-channel behaviors; interrelations of verbal and nonverbal behavior; American English (Chicago).
Chapter
Chapter outline: Introduction; Research on the Role of Interviewers in Telephone Survey Response Rates; The Role of Interviewer Voice Qualities in Telephone Survey Participation Decisions; Research Design; Analysis of Results; Conclusions.
Chapter
Chapter outline: Introduction; A Brief Introduction to Speech Recognition; Speech Recognition Applied to IVR Systems; Speech IVR Systems Compared to CATIs; Speech IVR Systems Compared to Touchtone IVR Systems; Conclusion; References.
Article
This paper examines whether the profusion of ums that so many speakers produce is noticed, and whether these ums influence what audiences think of speakers. Even though ums do not seem to be a product of anxiety or lack of preparation, the first study, using a simple questionnaire, indicated that the average listener assumes that they are. The second study manipulated um rates by editing a tape to create a version where ums were replaced by silence or were eliminated. The original and edited versions were played to audiences who were told to focus on either the content or the style, or were not given any particular instructions. Estimates of ums showed no sensitivity whatsoever in the content focus, some sensitivity without focus instruction, and greatest sensitivity with the style focus, suggesting that ums can be, but are not always, processed automatically. On subjective ratings of the speaker, filled pauses created a better impression than silent pauses, but no pauses proved best of all. The ums had an effect even in conditions where the audience was unable to report their presence.
Article
This paper analyzes prosodic variables in a corpus of eighteen oral presentations made by students of Technical English, all of whom were native speakers of Swedish. The focus is on the extent to which speakers were able to use their voices in a lively manner, and the hypothesis tested is that speakers who had high pitch variation as they spoke would be perceived as livelier speakers. A metric (termed PVQ), derived from the standard deviation in fundamental frequency, is proposed as a measure of pitch variation. Composite listener ratings of liveliness for nine 10-s samples of speech per speaker correlate strongly (r = .83, n = 18, p < .01) with the PVQ metric. Liveliness ratings for individual 10-s samples of speech show moderate but significant (n = 81, p < .01) correlations: r = .70 for males and r = .64 for females. The paper also investigates rate of speech and fluency variables in this corpus of L2 English. An application for this research is in presentation skills training, where computer feedback could be provided for speaking rate and the extent to which speakers have been able to use their voices in an engaging manner.
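A plausible implementation of a PVQ-style metric follows. The abstract says only that PVQ is "derived from the standard deviation in fundamental frequency"; normalizing that standard deviation by the mean F0, so that speakers with different baseline pitch are comparable, is an assumption made here for illustration:

```python
# Hedged sketch of a PVQ-style pitch variation metric: the coefficient of
# variation of F0 over voiced frames. The normalization is an assumption.
import numpy as np

def pvq(f0_hz):
    """Pitch variation quotient over voiced F0 samples (Hz)."""
    f0 = np.asarray(f0_hz, dtype=float)
    f0 = f0[f0 > 0]              # drop unvoiced frames (F0 == 0)
    return f0.std() / f0.mean()

monotone = 120 + np.random.default_rng(3).normal(0, 2, 200)   # flat speaker
lively   = 120 + np.random.default_rng(3).normal(0, 25, 200)  # varied speaker
print(pvq(monotone), pvq(lively))   # the lively speaker has the higher PVQ
```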
Article
Found significant differences between and within 8 normal and 7 clinic families with young children in behaviors, e.g., total number of times speaking, total and average duration of speech, number of times interrupted and interrupting, and incidences of simultaneous speech. Results, obtained from the families performing a series of experimental tasks, indicate that (a) there is more conflict in the clinic families, and (b) the normal family is characterized by a father dominance which appears to be accepted by the other members of the family, while the clinic family is characterized by a mother dominance which is unacceptable to the other family members.
Article
Despite their frequency in conversational talk, little is known about how ums and uhs affect listeners' on-line processing of spontaneous speech. Two studies of ums and uhs in English and Dutch reveal that hearing an uh has a beneficial effect on listeners' ability to recognize words in upcoming speech, but that hearing an um has neither a beneficial nor a detrimental effect. The results suggest that um and uh are different from one another and support the hypothesis that uh is a signal of a short upcoming delay and um is a signal of a long upcoming delay.