Article

Evaluating political parties: Criterion validity of open questions with requests for text and voice answers


Abstract

The rise of smartphone surveys, coupled with technological advancements, provides new ways of measuring respondents’ political attitudes. The use of open questions with requests for voice answers instead of text answers may simplify the answer process and provide nuanced information. So far, research comparing the measurement quality of text and voice answers is scarce. We therefore conducted an experiment in a smartphone survey (N = 2,402) to investigate the criterion validity of text and voice answers. Voice answers were collected using a JavaScript- and PHP-based voice recording tool that resembles the voice messaging function of instant-messaging services. The results show that open questions with requests for text and voice answers differ in terms of criterion validity. More specifically, the findings indicate that voice answers result in somewhat higher criterion validity than their text counterparts. More refined research on the measurement quality of text and voice answers is required to draw robust conclusions.


... In particular, smartphone sensors and apps allow researchers to collect new types of data, which can improve and expand survey measurement (Link et al., 2014), and offer the potential to reduce measurement errors, respondent burden and data collection costs (Jäckle et al., 2018). For example, GPS (McCool et al., 2021), accelerometers (Höhne & Schlosser, 2019; Höhne, Revilla, et al., 2020), web tracking applications and plug-ins (Bosch & Revilla, 2021b; Revilla et al., 2017) and microphones (Gavras & Höhne, 2022; Revilla & Couper, 2021; Revilla et al., 2020) have already been used in (mobile) web survey research. ...
Article
Full-text available
Images might provide richer and more objective information than text answers to open‐ended survey questions. Little is known, nonetheless, about the consequences for data quality of asking participants to answer open‐ended questions with images. Therefore, this paper addresses three research questions: (1) What is the effect of answering web survey questions with images instead of text on breakoff, noncompliance with the task, completion time and question evaluation? (2) What is the effect of including a motivational message on these four aspects? (3) Does the impact of asking to answer with images instead of text vary across device types? To answer these questions, we implemented a 2 × 3 between‐subject web survey experiment (N = 3043) in Germany. Half of the sample was required to answer using PCs and the other half with smartphones. Within each device group, respondents were randomly assigned to (1) a control group answering open‐ended questions with text; (2) a treatment group answering open‐ended questions with images; and (3) another treatment group answering open‐ended questions with images but prompted with a motivational message. Results show that asking participants to answer with images significantly increases participants' likelihood of noncompliance as well as their completion times, while worsening their overall survey experience. Including motivational messages, moreover, moderately reduces the likelihood of noncompliance. Finally, the likelihood of noncompliance is similar across devices.
Article
The ever-growing number of respondents completing web surveys via smartphones is paving the way for leveraging technological advances to improve respondents’ survey experience and, in turn, the quality of their answers. Smartphone surveys enable researchers to incorporate audio and voice features into web surveys, that is, having questions read aloud to respondents using pre-recorded audio files and collecting voice answers via the smartphone’s microphone. Moving from written to audio and voice communication channels might be associated with several benefits, such as humanizing the communication process between researchers and respondents. However, little is known about respondents’ willingness to undergo this change in communication channels. Replicating and extending earlier research, we examine the extent to which respondents are willing to use audio and voice channels in web surveys, the reasons for their (non)willingness, and respondent characteristics associated with (non)willingness. The results of a web survey conducted in a nonprobability online panel in Germany ( N = 2146) reveal that more than 50% of respondents would be willing to have the questions read aloud (audio channel) and about 40% would also be willing to give answers via voice input (voice channel). While respondents mostly name a general openness to new technologies for their willingness, they mostly name preference for written communication for their nonwillingness. Finally, audio and voice channels in smartphone surveys appeal primarily to frequent and competent smartphone users as well as younger and tech-savvy respondents.
Article
The rapid increase in smartphone surveys and technological developments open novel opportunities for collecting survey answers. One of these opportunities is the use of open‐ended questions with requests for oral instead of written answers, which may facilitate the answer process and result in more in‐depth and unfiltered information. Whereas it is now possible to collect oral answers on smartphones, we still lack studies on the impact of this novel answer format on the characteristics of respondents' answers. In this study, we compare the linguistic and content characteristics of written versus oral answers to political attitude questions. For this purpose, we conducted an experiment in a smartphone survey (N = 2402) and randomly assigned respondents to an answer format (written or oral). Oral answers were collected via the open source ‘SurveyVoice (SVoice)’ tool, whereas written answers were typed in via the smartphone keypad. Applying length analysis, lexical structure analysis, sentiment analysis and structural topic models, our results reveal that written and oral answers differ substantially from each other in terms of lengths, structures, sentiments and topics. We find evidence that written answers are characterized by an intentional and conscious answering, whereas oral answers are characterized by an intuitive and spontaneous answering.
Article
Full-text available
Multidimensional concepts are non-compensatory when higher values on one component cannot offset lower values on another. Thinking of the components of a multidimensional phenomenon as non-compensatory rather than substitutable can have wide-ranging implications, both conceptually and empirically. To demonstrate this point, we focus on populist attitudes that feature prominently in contemporary debates about liberal democracy. Given similar established public opinion constructs, the conceptual value of populist attitudes hinges on its unique specification as an attitudinal syndrome, which is characterized by the concurrent presence of its non-compensatory concept subdimensions. Yet this concept attribute is rarely considered in existing empirical research. We propose operationalization strategies that seek to take the distinct properties of non-compensatory multidimensional concepts seriously. Evidence on five populism scales in 12 countries reveals the presence and consequences of measurement-concept inconsistencies. Importantly, in some cases, using conceptually sound operationalization strategies upsets previous findings on the substantive role of populist attitudes.
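One operationalization strategy in this spirit aggregates subdimensions with the minimum rather than the mean, so a low score on any one component caps the overall score. A minimal sketch, using commonly discussed populist-attitude subdimensions as placeholder names (not the paper's exact scales or weights):

```python
# Illustrative sketch of compensatory vs. non-compensatory aggregation.
# A mean lets a high score on one subdimension offset a low one;
# the minimum does not: the weakest subdimension binds the total.
def compensatory(anti_elitism, people_centrism, manichaean):
    return (anti_elitism + people_centrism + manichaean) / 3

def non_compensatory(anti_elitism, people_centrism, manichaean):
    return min(anti_elitism, people_centrism, manichaean)

# A respondent high on anti-elitism but low on people-centrism:
print(compensatory(0.9, 0.2, 0.8))      # ≈ 0.63 — looks fairly populist
print(non_compensatory(0.9, 0.2, 0.8))  # 0.2 — syndrome not present
```

The two scores can diverge sharply for exactly the respondents whose classification matters most, which is why the choice of aggregation rule can upset substantive findings.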
Article
Full-text available
As people increasingly communicate via asynchronous non-spoken modes on mobile devices, particularly text messaging (e.g., SMS), longstanding assumptions and practices of social measurement via telephone survey interviewing are being challenged. In the study reported here, 634 people who had agreed to participate in an interview on their iPhone were randomly assigned to answer 32 questions from US social surveys via text messaging or speech, administered either by a human interviewer or by an automated interviewing system. 10 interviewers from the University of Michigan Survey Research Center administered voice and text interviews; automated systems launched parallel text and voice interviews at the same time as the human interviews were launched. The key question was how the interview mode affected the quality of the response data, in particular the precision of numerical answers (how many were not rounded), variation in answers to multiple questions with the same response scale (differentiation), and disclosure of socially undesirable information. Texting led to higher quality data—fewer rounded numerical answers, more differentiated answers to a battery of questions, and more disclosure of sensitive information—than voice interviews, both with human and automated interviewers. Text respondents also reported a strong preference for future interviews by text. The findings suggest that people interviewed on mobile devices at a time and place that is convenient for them, even when they are multitasking, can give more trustworthy and accurate answers than those in more traditional spoken interviews. The findings also suggest that answers from text interviews, when aggregated across a sample, can tell a different story about a population than answers from voice interviews, potentially altering the policy implications from a survey.
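The two precision indicators described here, rounding of numerical answers and differentiation across a same-scale battery, can be computed mechanically. A minimal sketch with invented answers and an assumed rounding threshold (multiples of 5), not the study's exact coding rules:

```python
from statistics import pstdev

def is_rounded(x, base=5):
    """Treat multiples of `base` (e.g., 5 or 10) as rounded answers."""
    return x % base == 0

def differentiation(battery):
    """Std. dev. across a battery of same-scale items; 0 = straightlining."""
    return pstdev(battery)

numeric_answers = [30, 35, 42, 50, 17, 60]   # e.g., "hours per week"
share_rounded = sum(is_rounded(x) for x in numeric_answers) / len(numeric_answers)

respondent_a = [4, 4, 4, 4, 4]   # undifferentiated (straightlining)
respondent_b = [2, 5, 3, 4, 1]   # differentiated
print(share_rounded)                      # ≈ 0.667
print(differentiation(respondent_a))      # 0.0
print(differentiation(respondent_b) > 0)  # True
```

Under these definitions, the finding that texting yields fewer rounded and more differentiated answers corresponds to a lower `share_rounded` and higher `differentiation` in the text condition.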
Article
Full-text available
Although the purpose of questionnaire items is to obtain a person’s opinion on a certain matter, a respondent’s registered opinion may not reflect his or her “true” opinion because of random and systematic errors. Response styles (RSs) are a respondent’s tendency to respond to survey questions in certain ways regardless of the content, and they contribute to systematic error. They affect univariate and multivariate distributions of data collected by rating scales and are alternative explanations for many research results. Despite this, RSs are often not controlled for in research. This article provides a comprehensive summary of the types of RSs, lists their potential sources, and discusses ways to diagnose and control for them. Finally, areas for further research on RSs are proposed.
Article
Full-text available
In this paper, we investigate whether there are differences in the effect of instrument design between trained and fresh respondents. In three experiments, we varied the number of items on a screen, the choice of response categories, and the layout of a five-point rating scale. In general, effects of design carry over between trained and fresh respondents. We found little evidence that survey experience influences the question-answering process. Trained respondents seem to be more sensitive to satisficing. The shorter completion time, higher interitem correlations for multiple-item-per-screen formats, and the fact that they select the first response options more often indicate that trained respondents tend to take shortcuts in the response process and study the questions less carefully.
Conference Paper
Full-text available
SentimentWortschatz, or SentiWS for short, is a publicly available German-language resource for sentiment analysis, opinion mining, etc. It lists positive and negative sentiment-bearing words weighted within the interval [-1; 1], plus their part-of-speech tag and, if applicable, their inflections. The current version of SentiWS (v1.8b) contains 1,650 negative and 1,818 positive words, which sum up to 16,406 positive and 16,328 negative word forms, respectively. It not only contains adjectives and adverbs explicitly expressing a sentiment, but also nouns and verbs implicitly containing one. The present work describes the resource's structure, the three sources utilised to assemble it and the semi-supervised method incorporated to weight the strength of its entries. Furthermore, the resource's contents are extensively evaluated using a German-language evaluation set we constructed. The evaluation set is verified to be reliable, and it is shown that SentiWS provides a beneficial lexical resource for German-language sentiment analysis related tasks to build on.
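Scoring with a weighted lexicon of this kind amounts to summing polarity weights over the tokens of a text. A minimal sketch in the SentiWS style; the mini-lexicon and its weights below are invented for illustration, not actual SentiWS entries:

```python
# Invented SentiWS-style mini-lexicon: word -> polarity weight in [-1, 1].
lexicon = {
    "gut": 0.37,        # "good"
    "schlecht": -0.77,  # "bad"
    "freude": 0.65,     # "joy"
    "angst": -0.57,     # "fear"
}

def sentiment_score(text):
    """Sum lexicon weights over lowercased tokens; unknown words score 0."""
    tokens = text.lower().split()
    return sum(lexicon.get(t, 0.0) for t in tokens)

print(sentiment_score("die partei macht gut Arbeit"))     # 0.37
print(round(sentiment_score("schlecht und Angst"), 2))    # -1.34
```

A production pipeline would additionally handle the listed inflections and part-of-speech tags; this sketch keeps only the core weighted-lookup idea.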
Article
Full-text available
Previous research has revealed techniques to improve response quality in open-ended questions in both paper and interviewer-administered survey modes. The purpose of this paper is to test the effectiveness of similar techniques in web surveys. Using data from a series of three random sample web surveys of Washington State University undergraduates, we examine the effects of visual and verbal answer-box manipulations (i.e., altering the size of the answer box and including an explanation that answers could exceed the size of the box) and the inclusion of clarifying and motivating introductions in the question stem. We gauge response quality by the amount and type of information contained in responses as well as response time and item nonresponse. The results indicate that increasing the size of the answer box has little effect on early responders to the survey but substantially improved response quality among late responders. Including any sort of explanation or introduction that made response quality and length salient also improved response quality for both early and late responders. In addition to discussing these techniques, we also address the potential of the web survey mode to revitalize the use of open-ended questions in self-administered surveys.
Article
More and more respondents are answering web surveys using mobile devices. Mobile respondents tend to provide shorter responses to open questions than PC respondents. Using voice recording to answer open-ended questions could increase data quality and help engage groups usually underrepresented in web surveys. Revilla, Couper, Bosch, and Asensio showed that in particular the use of voice recording still presents many challenges, even if it could be a promising tool. This article reports results from a follow-up experiment in which the main goals were to (1) test whether different instructions on how to use the voice recording tool reduce technical and understanding problems, and thereby reduce item nonresponse while preserving data quality and the evaluation of the tool; (2) test whether nonresponse due to context can be reduced by using a filter question, and how this affects data quality and the tool evaluation; and (3) understand which factors affect nonresponse to open-ended questions using voice recording, and if these factors also affect data quality and the evaluation of the tool. The experiment was implemented within a smartphone web survey in Spain focused on Android devices. The results suggest that different instructions did not affect nonresponse to the open questions and had little effect on data quality for those who did answer. Introducing a filter to ensure that people were in a setting that permits voice recording seems useful. Despite efforts to reduce problems, a substantial proportion of respondents are still unwilling or unable to answer open questions using voice recording.
Article
The analysis of political texts from parliamentary speeches, party manifestos, social media, or press releases forms the basis of major and growing fields in political science, not least since advances in “text-as-data” methods have rendered the analysis of large text corpora straightforward. However, a lot of sources of political speech are not regularly transcribed, and their on-demand transcription by humans is prohibitively expensive for research purposes. This class includes political speech in certain legislatures, during political party conferences as well as television interviews and talk shows. We showcase how scholars can use automatic speech recognition systems to analyze such speech with quantitative text analysis models of the “bag-of-words” variety. To probe results for robustness to transcription error, we present an original “word error rate simulation” (WERSIM) procedure implemented in R. We demonstrate the potential of automatic speech recognition to address open questions in political science with two substantive applications and discuss its limitations and practical challenges.
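The core idea of a word-error-rate simulation is to inject artificial transcription errors at a chosen rate and check whether downstream bag-of-words results hold up. A simplified Python analogue of that idea, substitutions only, not the authors' R implementation of WERSIM:

```python
import random
from collections import Counter

def perturb(tokens, wer, vocab, rng):
    """Substitute roughly a `wer` share of tokens with random vocab words."""
    out = []
    for tok in tokens:
        if rng.random() < wer:
            out.append(rng.choice(vocab))  # simulated misrecognition
        else:
            out.append(tok)
    return out

rng = random.Random(42)
transcript = "taxes should fund schools and schools need teachers".split()
vocab = ["taxes", "schools", "teachers", "borders", "trade"]
noisy = perturb(transcript, wer=0.3, vocab=vocab, rng=rng)

print(Counter(transcript)["schools"])  # term count in the clean transcript
print(Counter(noisy))                  # counts after simulated ASR errors
```

Re-running the text analysis on many such perturbed corpora, across a range of error rates, shows how sensitive the substantive conclusions are to transcription quality; a full WER simulation would also model insertions and deletions.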
Article
We implemented an experiment within a smartphone web survey to explore the feasibility of using voice input (VI) options. Based on device used, participants were randomly assigned to a treatment or control group. Respondents in the iPhone operating system (iOS) treatment group were asked to use the dictation button, in which the voice was translated automatically into text by the device. Respondents with Android devices were asked to use a VI button which recorded the voice and transmitted the audio file. Both control groups were asked to answer open-ended questions using standard text entry. We found that the use of VI still presents a number of challenges for respondents. Voice recording (Android) led to substantially higher nonresponse, whereas dictation (iOS) led to slightly higher nonresponse, relative to text input. However, completion time was significantly reduced using VI. Among those who provided an answer, when dictation was used, we found fewer valid answers and less information provided, whereas for voice recording, longer and more elaborated answers were obtained. Voice recording (Android) led to significantly lower survey evaluations, but not dictation (iOS).
Article
Spatial models of issue voting generally assume that citizens have a single “vote function”. A given voter is expected to evaluate all parties using the same issue criteria. The impact of issues can vary between citizens and contexts, but is normally considered to be constant across parties. This paper reassesses this central assumption, by suggesting that party characteristics influence the salience of issue considerations in voters’ evaluations. Voters should rely more strongly on issues which are frequently associated with a given party and for which its issue stances are better known. Our analysis of the 2014 European elections supports these hypotheses by showing that the impact of voter-party issue distances on party evaluations is systematically related to the clarity and extremism of parties’ issue positions, as well as to their size and governmental status. These findings imply an important modification of standard proximity models of electoral competition and party preferences.
Article
Mobile coverage recently has reached an all-time high, and in most countries, high-speed Internet connections are widely available. Due to technological development, smartphones and tablets have become increasingly popular. Accordingly, we have observed an increasing use of mobile devices to complete web surveys and, hence, survey methodologists have shifted their attention to the challenges that stem from this development. The present study investigated whether the growing use of smartphones has decreased how systematically this choice of device varies between groups of respondents (i.e., how selective smartphone usage for completing web surveys is). We collected a data set of 18,520 respondents from 18 web surveys that were fielded in Germany between 2012 and 2016. Based on these data, we show that while the use of smartphones to complete web surveys has considerably increased over time, selectivity with respect to using this device has remained stable.
Article
Scholars estimating policy positions from political texts typically code words or sentences and then build left-right policy scales based on the relative frequencies of text units coded into different categories. Here we reexamine such scales and propose a theoretically and linguistically superior alternative based on the logarithm of odds-ratios. We contrast this scale with the current approach of the Comparative Manifesto Project (CMP), showing that our proposed logit scale avoids widely acknowledged flaws in previous approaches. We validate the new scale using independent expert surveys. Using existing CMP data, we show how to estimate more distinct policy dimensions, for more years, than has been possible before, and make this dataset publicly available. Finally, we draw some conclusions about the future design of coding schemes for political texts.
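The proposed scale replaces relative category frequencies with the logarithm of an odds ratio of coded text-unit counts. A minimal sketch; the counts are invented, and the 0.5 smoothing term (a common way to keep the scale defined at zero counts) is an assumption of this illustration:

```python
from math import log

def logit_scale(right_count, left_count):
    """Log odds-ratio position from right- vs. left-coded text units,
    with 0.5 smoothing so zero counts remain defined."""
    return log((right_count + 0.5) / (left_count + 0.5))

print(logit_scale(30, 10) > 0)   # more right-coded units -> positive score
print(logit_scale(10, 30) < 0)   # more left-coded units -> negative score
print(logit_scale(0, 0))         # defined even with zero counts: log(1) = 0
```

Unlike a difference of proportions, the scale is symmetric around zero and stretches differences at the extremes, which is part of the argument for its linguistic and theoretical superiority.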
Cloud speech-to-text API
  • Google