Article

Public Opinion and the Classical Tradition: Redux in the Digital Age

Abstract

Digital trace data have the potential to offer rich insight into complex behaviors that were once out of reach, but their use has raised vital and unresolved questions about what is—or is not—public opinion. Building on the work of James Bryce, Lindsay Rogers, Herbert Blumer, Paul Lazarsfeld, and more, this essay revisits the discipline’s historical roots and draws parallels between past theory and present practice. Today, scholars treat public opinion as the summation of individual attitudes, weighted equally and expressed anonymously at static points in time through polls, yet prior to the advent of survey research, it was conceived as something intrinsically social and dynamic. In an era dominated by online discussion boards and social media platforms, the insights of this earlier “classical tradition” offer two pathways forward. First, for those who criticize computational social science as poorly theorized, it provides a strong justification for the work that data scientists do in text mining and sentiment analysis. And second, it offers clues for how emerging technologies might be leveraged effectively for the study of public opinion in the future.

... Regarding each of these topics, policies, and personas, a majority and minority position can be identified at a given point in time. Despite its dominance, this view of public opinion has faced recurring criticism for its narrow scope, as well as its failure to account for the inherently social and dynamic nature of public opinion (Guber, 2021; Herbst, 1993; Lewis, 2001). In contrast, broader approaches to public opinion emphasize political attitudes, beliefs, and values at the core of the phenomenon (Howe & Krosnick, 2022; Price & Neijens, 1997). ...
Article
Experimental studies show that public opinion cues shape people’s perceptions of public opinion. However, the full extent of political information these cues comprise and, consequently, communicate remains unexplored. This study utilizes standardized content analysis to examine the political values in applauded statements (n = 1,722) from Ukraine’s major political talk shows. The results suggest that applauded statements embody a complex mix of identity- and governance-oriented values, signaling to the broader audience the preferred type of political debate among those within the studio. The study enhances the literature on public opinion cues by demonstrating the significance of analyzing their underlying content.
... Given the urgency to address environmental concerns, real-time insights that can be quickly interpreted are invaluable. As Guber (2021) notes, our “hyper-networked world” allows for significant contributions to political conversations, transforming public opinion into a form of “digital democracy”. ...
... The chapter is about research on passive public opinion using instruments based on the 'one citizen one vote' principle, that is, all citizens having the same chance to be heard. Excluded is research on self-selected citizens, be they visitors to cafes or attendees at public meetings, writers of 'letters to the editor', protesters or social media users, thus excluding historical research of the public sphere and present-day data-mining of digital traces (see Guber, 2021; Splichal, 2022). ...
Chapter
Representative polls have been advocated as neutral, scientifically legitimised instruments to translate citizens’ concerns, needs and wishes into public priorities and political options; but they have also been criticised as tools for politicians to manipulate the population. This chapter looks at these intermediary roles of polls and of more qualitative and interactive public opinion research and how this research has become intertwined with democratic innovations. The consultation practices of the European Union demonstrate the risks of these developments. However, from a civic engagement perspective, checking the representativeness claims of political actors and facilitating public deliberation are still relevant intermediary roles for public opinion research. They might be hard to achieve, but it is worth trying for a more responsive polity and a more reflexive civil society.
... Researchers have tried to use social media data to interpolate and extend public opinion estimates to situations with insufficient polling (Beauchamp, 2017) or have focused on the recovery of trends, recognizing that levels of support for a given candidate or party are unfeasible to recover (Bovet et al., 2018); such trends can then be meaningful if they are anchored on survey estimates. This understanding that what social media can capture is "relative" rather than "absolute" public opinion (Gong et al., 2020) is consistent with the idea that surveys and digital platforms capture fundamentally different ways of looking at public opinion (Guber, 2021) and is also more promising than social media-only approaches. ...
Preprint
Electoral prediction and social media met in 2010. The reasoning was straightforward: people use social media to express their opinions and, therefore, it should be possible to use that information to capture the mood of the public and even predict electoral results. However, in spite of many attempts, the subfield has still not produced a commonly accepted, generally valid method. In this Chapter, we discuss the original foundations of the field, and the limitations that still pervade it. We outline some of the reasons why this program has generally remained stuck in the same original challenges that were identified ten years ago.
... Before the advent of survey research, public opinion was observed when individuals expressed their views through newspapers at public meetings, rallies, and torchlight parades. At present, scholars always treat public opinion as the summation of individual attitudes, weighted equally and expressed anonymously at static points in time through polls (Guber, 2021). Public opinion polls provide voters with information to decide whether to vote or abstain in many countries (Großer & Schram, 2010). ...
... This article is not so much an analysis of British public opinion on Brexit (cf. Curtice 2018; Guber 2021; Hobolt and Tilley 2021) as an attempt to complement this picture by determining how the outcomes of Brexit are perceived by Polish migrants settled in the UK, who constitute an important part of British society (White 2016). ...
Article
The aim of the article is to present the opinions of Polish migrants in Britain on the gains or losses that Brexit may bring to the European Union (EU), the United Kingdom (UK), and Poland, as well as the respondents themselves and their families. These opinions were determined based on the analysis of the results of a survey carried out among these migrants and presented against the backdrop of the results of public opinion polls on EU membership, which have been conducted in the British Isles regularly since the 1970s. The article analyses the beliefs held on this issue by economic migrants, who are faced with a choice as Brexit is underway: to remain expatriates or to return to their country of origin. Among the answers to questions about the possible benefits or negative outcomes of Brexit, it was the latter that predominated. In the discussion, the authors seek to ascertain why migrants from Poland fear the negative consequences of Brexit for the UK and for Europe more often than they fear those for Poland or for themselves and their close family members.
Article
Terms such as ‘fake news’ and ‘post-truth’ circulate freely today within the popular lexicon. It is an environment where objective facts have ‘become less influential in shaping public opinion than appeals to emotion and personal belief’ (OED). Central here is to understand the conceptual grounding of subjective opinion as a historically specific epistemological structure of social communication. My paper will draw on the Hegelian tradition of critical theory that has in unique ways unified an analysis of the nexus between socio-economic structures and epistemological frameworks. Here I name opinion as a historically specific epistemological structure of self-certainty, which receives validation within what Adorno called the Halbbildung of industrial culture, a form of social consciousness cultivated by the spread of information and economic imperative. It will be argued that the concept of opinion becomes a vital question for understanding, in this ‘post-truth’ landscape, current standards of instantaneous communication and cultural transmission.
Article
Background: COVID-19 is a scientifically and medically novel disease that is not fully understood because it has yet to be consistently and deeply studied. Among the gaps in research on the COVID-19 outbreak, there is a lack of sufficient infoveillance data. Objective: The aim of this study was to increase understanding of public awareness of COVID-19 pandemic trends and uncover meaningful themes of concern posted by Twitter users in the English language during the pandemic. Methods: Data mining was conducted on Twitter to collect a total of 107,990 tweets related to COVID-19 between December 13 and March 9, 2020. The analyses included frequency of keywords, sentiment analysis, and topic modeling to identify and explore discussion topics over time. A natural language processing approach and the latent Dirichlet allocation algorithm were used to identify the most common tweet topics as well as to categorize clusters and identify themes based on the keyword analysis. Results: The results indicate three main aspects of public awareness and concern regarding the COVID-19 pandemic. First, the trend of the spread and symptoms of COVID-19 can be divided into three stages. Second, the results of the sentiment analysis showed that people have a negative outlook toward COVID-19. Third, based on topic modeling, the themes relating to COVID-19 and the outbreak were divided into three categories: the COVID-19 pandemic emergency, how to control COVID-19, and reports on COVID-19. Conclusions: Sentiment analysis and topic modeling can produce useful information about the trends in the discussion of the COVID-19 pandemic on social media as well as alternative perspectives to investigate the COVID-19 crisis, which has created considerable public awareness. This study shows that Twitter is a good communication channel for understanding both public concern and public awareness about COVID-19. These findings can help health departments communicate information to alleviate specific public concerns about the disease.
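The keyword-frequency step mentioned in the abstract above could be sketched roughly as follows. This is a minimal illustration only: the example tweets, the stopword list, and the `keyword_counts` helper are all hypothetical and are not taken from the study, whose actual preprocessing is not described here.

```python
import re
from collections import Counter

# Hypothetical example tweets; these are not data from the study.
tweets = [
    "Worried about the COVID-19 outbreak and travel bans",
    "How to control COVID-19: wash your hands",
    "New reports on the outbreak today",
]

# Small illustrative stopword list (an assumption, not the study's list).
STOPWORDS = {"the", "and", "to", "on", "about", "how", "your", "new"}


def keyword_counts(texts):
    """Count lowercase word tokens across all texts, skipping stopwords."""
    counts = Counter()
    for text in texts:
        # Keep letters, digits, and hyphens so "covid-19" stays one token.
        tokens = re.findall(r"[a-z0-9\-]+", text.lower())
        counts.update(t for t in tokens if t not in STOPWORDS)
    return counts


counts = keyword_counts(tweets)
```

Calling `counts.most_common(5)` would then list the most frequent terms. Sentiment analysis and topic modeling, the other two steps the abstract names, would need additional tooling that this sketch does not cover.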
Article
Background: Since the beginning of the coronavirus disease 2019 (COVID-19) epidemic, misinformation has been spreading uninhibited over traditional and social media at a rapid pace. We sought to analyze the magnitude of misinformation that is being spread on Twitter (Twitter, Inc., San Francisco, CA) regarding the coronavirus epidemic. Materials and methods: We conducted a search on Twitter using 14 different trending hashtags and keywords related to the COVID-19 epidemic. We then summarized and assessed individual tweets for misinformation in comparison to verified and peer-reviewed resources. Descriptive statistics were used to compare terms and hashtags, and to identify individual tweets and account characteristics. Results: The study included 673 tweets. Most tweets were posted by informal individuals/groups (66%), and 129 (19.2%) belonged to verified Twitter accounts. The majority of included tweets contained serious content (91.2%); 548 tweets (81.4%) included genuine information pertaining to the COVID-19 epidemic. Around 70% of the tweets tackled medical/public health information, while the others were pertaining to sociopolitical and financial factors. In total, 153 tweets (24.8%) included misinformation, and 107 (17.4%) included unverifiable information regarding the COVID-19 epidemic. The rate of misinformation was higher among informal individual/group accounts (33.8%, p < 0.001). Tweets from unverified Twitter accounts contained more misinformation (31.0% vs 12.6% for verified accounts, p < 0.001). Tweets from healthcare/public health accounts had the lowest rate of unverifiable information (12.3%, p = 0.04). The number of likes and retweets per tweet was not associated with a difference in either false or unverifiable content. The keyword “COVID-19” had the lowest rate of misinformation and unverifiable information, while the keywords “#2019_ncov” and “Corona” were associated with the highest amount of misinformation and unverifiable content respectively. Conclusions: Medical misinformation and unverifiable content pertaining to the global COVID-19 epidemic are being propagated at an alarming rate on social media. We provide an early quantification of the magnitude of misinformation spread and highlight the importance of early interventions in order to curb this phenomenon that endangers public safety at a time when awareness and appropriate preventive actions are paramount.
Article
This article compares digital arenas such as Twitter with the principles prescribed by the bourgeois public sphere, to examine how close or far these arenas are from Habermas’ original concept. By focusing on one of the criteria, the current influence of elites on political debate, it discusses the Habermasian principles of general accessibility and non-dominance of the elites as prerequisites for a functioning public sphere. This study finds that even though there are few access restrictions on Twitter and despite the fact that no one, in principle, is excluded from the platform, there is no apparent elimination of privileges and the elites maintain their elite status within its borders. Methodologically, the article relies on empirical research of hashtagged exchanges on Twitter during the General Elections in the United Kingdom in 2015. Through the mapping of Twitter as a synthesis of dialogic arenas, it explores the elite-focused discourse and the vocal actors in the stream, underscoring the presence of the elites, even in an indirect way. Drawing on these elements, the article argues for a reconceptualization of the normative perception of the public sphere, suggesting the notion of exclusion is a complex issue that includes expanding notions of publics to also include those topics being discussed. Finally, it focuses on the significance of journalism in relation to political dialogue and argues that the move towards less elite-centered arenas largely depends on journalism.
Article
Natural hazards are becoming increasingly expensive as climate change and development are exposing communities to greater risks. Preparation and recovery are critical for climate change resilience, and social media are being used more and more to communicate before, during, and after disasters. While there is a growing body of research aimed at understanding how people use social media surrounding disaster events, most existing work has focused on a single disaster case study. In the present study, we analyze five of the costliest disasters in the last decade in the United States (Hurricanes Irene and Sandy, two sets of tornado outbreaks, and flooding in Louisiana) through the lens of Twitter. In particular, we explore the frequency of both generic and specific food-security related terms, and quantify the relationship between network size and Twitter activity during disasters. We find differences in tweet volume for keywords depending on disaster type, with people using Twitter more frequently in preparation for Hurricanes, and for real-time or recovery information for tornado and flooding events. Further, we find that people share a host of general disaster and specific preparation and recovery terms during these events. Finally, we find that among all account types, individuals with “average” sized networks are most likely to share information during these disasters, and in most cases, do so more frequently than normal. This suggests that around disasters, an ideal form of social contagion is being engaged in which average people rather than outsized influentials are key to communication. These results provide important context for the type of disaster information and target audiences that may be most useful for disaster communication during varying extreme events.
Article
This paper aims to contribute to the development of tools to support an analysis of Big Data as manifestations of social processes and human behaviour. Such a task demands both an understanding of the epistemological challenge posed by the Big Data phenomenon and a critical assessment of the offers and promises coming from the area of Big Data analytics. This paper draws upon the critical social and data scientists’ view on Big Data as an epistemological challenge that stems not only from the sheer volume of digital data but, predominantly, from the proliferation of the narrow-technological and the positivist views on data. Adoption of the social-scientific epistemological stance presupposes that digital data was conceptualised as manifestations of the social. In order to answer the epistemological challenge, social scientists need to extend the repertoire of social scientific theories and conceptual frameworks that may inform the analysis of the social in the age of Big Data. However, an ‘epistemological revolution’ discourse on Big Data may hinder the integration of the social scientific knowledge into the Big Data analytics.
Article
Evidence increasingly suggests that neural structures that respond to primary and secondary rewards are also implicated in the processing of social rewards. The "Like" - a popular feature on social media - shares features with both monetary and social rewards as a means of feedback that shapes reinforcement learning. Despite the ubiquity of the Like, little is known about the neural correlates of providing this feedback to others. In the present study, we mapped the neural correlates of providing Likes to others on social media. Fifty-eight adolescents and young adults completed a task in the MRI scanner designed to mimic the social photo-sharing app Instagram. We examined neural responses when participants provided positive feedback to others. The experience of providing Likes to others on social media related to activation in brain circuitry implicated in reward, including the striatum and ventral tegmental area, regions also implicated in the experience of receiving Likes from others. Providing Likes was also associated with activation in brain regions involved in salience processing and executive function. We discuss the implications of these findings for our understanding of the neural processing of social rewards, as well as the neural processes underlying social media use.
Article
A growing social science literature has used Twitter and Facebook to study political and social phenomena including for election forecasting and tracking political conversations. This research note uses a nationally representative probability sample of the British population to examine how Twitter and Facebook users differ from the general population in terms of demographics, political attitudes and political behaviour. We find that Twitter and Facebook users differ substantially from the general population on many politically relevant dimensions including vote choice, turnout, age, gender, and education. On average social media users are younger and better educated than non-users, and they are more liberal and pay more attention to politics. Despite paying more attention to politics, social media users are less likely to vote than non-users, but they are more likely to support the left leaning Labour Party when they do vote. However, we show that these apparent differences mostly arise due to the demographic composition of social media users. After controlling for age, gender, and education, no statistically significant differences arise between social media users and non-users on political attention, values or political behaviour.
Article
Social media are an important source of information about the political issues, reflecting, as well as influencing, public mood. We present an analysis of Twitter data, collected over 6 weeks before the Brexit referendum, held in the UK in June 2016. We address two questions: what is the relation between the Twitter mood and the referendum outcome, and who were the most influential Twitter users in the pro- and contra-Brexit camps? First, we construct a stance classification model by machine learning methods, and are then able to predict the stance of about one million UK-based Twitter users. The demography of Twitter users is, however, very different from the demography of the voters. By applying a simple age-adjusted mapping to the overall Twitter stance, the results show the prevalence of the pro-Brexit voters, something unexpected by most of the opinion polls. Second, we apply the Hirsch index to estimate the influence, and rank the Twitter users from both camps. We find that the most productive Twitter users are not the most influential, that the pro-Brexit camp was four times more influential, and had considerably larger impact on the campaign than the opponents. Third, we find that the top pro-Brexit communities are considerably more polarized than the contra-Brexit camp. These results show that social media provide a rich resource of data to be exploited, but accumulated knowledge and lessons learned from the opinion polls have to be adapted to the new data sources.
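The Hirsch index mentioned in the abstract above can be illustrated with a short sketch. The abstract does not say which per-tweet quantity the index was computed over, so the code assumes retweet counts; the `h_index` function name and the sample numbers are hypothetical. Under that assumption, a user has index h if h of their tweets were each retweeted at least h times.

```python
def h_index(retweet_counts):
    """Return the largest h such that h tweets have at least h retweets each.

    Assumption: influence is measured over per-tweet retweet counts; the
    abstract does not specify the exact quantity used.
    """
    ranked = sorted(retweet_counts, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` tweets all have at least `rank` retweets
        else:
            break
    return h


# Hypothetical users: a prolific account with little pickup, and a less
# active account whose tweets are widely retweeted.
prolific_user = [1, 0, 2, 1, 0, 1, 0, 0, 1, 1]
influential_user = [40, 25, 12, 9, 6]
```

With these made-up numbers the prolific account scores lower than the less active one, which mirrors the abstract's point that the most productive users are not necessarily the most influential.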
Article
The main aim of this study was to examine the norms of expressing emotions on social media. Specifically, the perceived appropriateness (i.e. injunctive norms) of expressing six discrete emotions (i.e. sadness, anger, disappointment, worry, joy, and pride) was investigated across four different social media platforms. Drawing on data collected in March 2016 among 1201 young Dutch users (15–25 years), we found that positive expressions were generally perceived as more appropriate than negative expressions across all platforms. In line with the objective of the study, some platform differences were found. The expression of negative emotions was rated as most appropriate for WhatsApp, followed by Facebook, Twitter, and Instagram. For positive emotion expression, perceived appropriateness was highest for WhatsApp, followed by Instagram, Facebook, and Twitter. Additionally, some gender differences were found, while age showed little variations. Overall, the results contribute to a more informed understanding of emotion expression online.
Article
The availability and usability of massive data sets have added to the increasing popularity of big data research. However, common mechanisms of big data collection (e.g., social media, open source platforms, and other online user data) can be problematic. Sampling issues, especially selection bias, associated with these data sources can have far reaching implications for analysis and interpretation. This paper examines the types of sampling issues that arise in big data projects, how and why biases occur, and their implications. It concludes by providing strategies for dealing with sampling and selection bias in big data projects.
Article
In the last decade, advances in computer technologies have led to a technological reality where the line between information consumers and information producers is blurred. This technological omnipresence allows for unprecedented data creation capabilities. Based on various data sources, it seems humans have fully embraced data-generating activities. One such activity is using online social network applications, such as Facebook or Twitter, in almost all aspects of their lives. One of the main features of online social network applications is perceived freedom of speech, individuality and privacy, even though every application has some special features. Therefore, content generated using these services presents the public with interesting insights into people’s private lives and their attitudes towards public affairs. Social network applications are most active during specific public events aimed at the massive public. Due to its brevity, ease of use and frequency, Twitter is an interesting social network application for research and analysis. Other than allowing almost exclusively short messages (up to 140 characters), a tweet (a Twitter post) can contain the location of the message sender as well as a graphic to accompany the textual message. The textual part of the message may contain so-called hashtags – keywords used for indexing and easy identification of a subject the message is related to. These hashtags allow us to group messages related to a specific event. Recent governmental elections held in Croatia were very popular amongst the Croatian Twitter community. Usage of hashtags allowed us to identify the right messages and thus the most-used words to describe this event and potentially identify how people felt when talking, i.e. writing, about politics and the held elections. Furthermore, geolocation information, optionally embedded in a tweet, makes it possible to analyze which keywords were used in which parts of Croatia, all pertaining to the held governmental elections. Public opinion is only a few tweets away, but are the results similar to the election results? If tweet-based opinion can be constructed, does it differ from the real results? These and similar questions will be addressed in this paper.
Article
News media regularly include the voice of the “man or woman in the street” alongside that of the actors involved in a news story. Journalists use these vox pops to give an impression of public opinion. With the coming of social media, access to people’s opinions has never been so easy. Little research exists about how social media (Twitter in particular) are used by journalists to describe public opinion. This is the question this research aims to answer by using a combination of a qualitative and a quantitative content analysis of Dutch and Flemish news websites. We found several patterns regarding the use of Twitter vox pops. First, we found Twitter to be regularly used as a representation of public opinion. Second, many items generalised these opinions to larger groups with strong—mostly negative—emotions. Third, when referring to Twitter, the articles used (abstract) quantifiers and hyperbolic terms (commotion, explosions) to imply an objective basis for these inferences about the “vox Twitterati”.
Article
Demonstrations that analyses of social media content can align with measurement from sample surveys have raised the question of whether survey research can be supplemented or even replaced with less costly and burdensome data mining of already-existing or “found” social media content. But just how trustworthy such measurement can be—say, to replace official statistics—is unknown. Survey researchers and data scientists approach key questions from starting assumptions and analytic traditions that differ on, for example, the need for representative samples drawn from frames that fully cover the population. New conversations between these scholarly communities are needed to understand the potential points of alignment and non-alignment. Across these approaches, there are major differences in (a) how participants (survey respondents and social media posters) understand the activity they are engaged in; (b) the nature of the data produced by survey responses and social media posts, and the inferences that are legitimate given the data; and (c) practical and ethical considerations surrounding the use of the data. Estimates are likely to align to differing degrees depending on the research topic and the populations under consideration, the particular features of the surveys and social media sites involved, and the analytic techniques for extracting opinions and experiences from social media. Traditional population coverage may not be required for social media content to effectively predict social phenomena to the extent that social media content distills or summarizes broader conversations that are also measured by surveys.
Article
There is a large body of research on utilizing online activity as a survey of political opinion to predict real world election outcomes. There is considerably less work, however, on using this data to understand topic-specific interest and opinion amongst the general population and specific demographic subgroups, as currently measured by relatively expensive surveys. Here we investigate this possibility by studying a full census of all Twitter activity during the 2012 election cycle along with the comprehensive search history of a large panel of Internet users during the same period, highlighting the challenges in interpreting online and social media activity as the results of a survey. As noted in existing work, the online population is a non-representative sample of the offline world (e.g., the U.S. voting population). We extend this work to show how demographic skew and user participation is non-stationary and difficult to predict over time. In addition, the nature of user contributions varies substantially around important events. Furthermore, we note subtle problems in mapping what people are sharing or consuming online to specific sentiment or opinion measures around a particular topic. We provide a framework, built around considering this data as an imperfect continuous panel survey, for addressing these issues so that meaningful insight about public interest and opinion can be reliably extracted from online and social media data.
Article
Opinion leaders can be influential in persuading their peers about news and politics, yet their potential influence has been questioned in the social media era. This study tests a theoretical model of attempts at political persuasion within social media in which highly active users (“prosumers”) consider themselves opinion leaders, which subsequently increases efforts to try and change others’ political attitudes and behaviors. Using two-wave U.S. panel survey data (W1, 1,816; W2, 1,024), we find prosumers believe they are highly influential in their social networks and are both directly and indirectly more likely to try to persuade others. Our results highlight one theoretical mechanism through which engaged social media users attempt to persuade others and suggest personal influence remains viable within social media.
Article
Recent years have seen an increase in the amount of statistics describing different phenomena based on “Big Data.” This term includes data characterized not only by their large volume, but also by their variety and velocity, the organic way in which they are created, and the new types of processes needed to analyze them and make inference from them. The change in the nature of the new types of data, their availability, and the way in which they are collected and disseminated is fundamental. This change constitutes a paradigm shift for survey research. There is great potential in Big Data, but there are some fundamental challenges that have to be resolved before its full potential can be realized. This report provides examples of different types of Big Data and their potential for survey research; it also describes the Big Data process, discusses its main challenges, and considers solutions and research needs.
Article
Public opinion research is entering a new era, one in which traditional survey research may play a less dominant role. The proliferation of new technologies, such as mobile devices and social-media platforms, is changing the societal landscape across which public opinion researchers operate. As these technologies expand, so does access to users’ thoughts, feelings, and actions expressed instantaneously, organically, and often publicly across the platforms they use. The ways in which people both access and share information about opinions, attitudes, and behaviors have gone through a greater transformation in the past decade than perhaps in any previous point in history, and this trend appears likely to continue. The ubiquity of social media and the opinions users express on social media provide researchers with new data-collection tools and alternative sources of qualitative and quantitative information to augment or, in some cases, provide alternatives to more traditional data-collection methods. The reasons to consider social media in public opinion and survey research are no different than those of any alternative method. We are ultimately concerned with answering research questions, and this often requires the collection of data in one form or another. This may involve the analysis of data to obtain qualitative insights or quantitative estimates. The quality of data and the ability to help accurately answer research questions are of paramount concern. Other practical considerations include the cost efficiency of the method and the speed at which the data can be collected, analyzed, and disseminated. If the combination of data quality, cost efficiency, and timeliness required by a study can best be achieved through the use of social media, then there is reason to consider these methods for research. An additional reason to consider social media in public opinion and survey research is its explosion in popularity over the past several years.
Article
The consequences of anthropogenic climate change are extensively debated through scientific papers, newspaper articles, and blogs. Newspaper articles may lack accuracy, while the severity of findings in scientific papers may be too opaque for the public to understand. Social media, however, is a forum where individuals of diverse backgrounds can share their thoughts and opinions. As consumption shifts from old media to new, Twitter has become a valuable resource for analyzing current events and headline news. In this research, we analyze tweets containing the word "climate" collected between September 2008 and July 2014. We determine how collective sentiment varies in response to climate change news, events, and natural disasters. Words uncovered by our analysis suggest that responses to climate change news are predominately from climate change activists rather than climate change deniers, indicating that Twitter is a valuable resource for the spread of climate change awareness.
Article
A new breed of researcher is turning to computation to understand society, and then change it.
Article
There is interest in using social media content to supplement or even substitute for survey data. In one of the studies to test the feasibility of this idea, O’Connor, Balasubramanyan, Routledge, and Smith report reasonably high correlations between the sentiment of tweets containing the word “jobs” and survey-based measures of consumer confidence in 2008–2009. Other researchers report a similar relationship through 2011, but after that time it is no longer observed, suggesting such tweets may not be as promising an alternative to survey responses as originally hoped. But, it’s possible that with the right analytic techniques, the sentiment of “jobs” tweets might still be an acceptable alternative. To explore this, we first classify “jobs” tweets into categories whose content is either related to employment or not, to see whether sentiment of the former correlates more highly with a survey-based measure of consumer sentiment. We then compare the relationship when sentiment is determined with traditional dictionary-based methods versus newer machine learning-based tools developed for Twitter-like texts. We calculated daily sentiment in three different ways and used a measure of association less sensitive to outliers than correlation. None of these approaches improved the size of the relationship in the original or more recent data. We found that the many micro-decisions these analyses require, such as the size of the smoothing interval and the length of the lag between the two series, can significantly affect the outcomes. In the end, despite the earlier promise of tweets as an alternative to survey responses, we find no evidence that the original relationship in these data was more than a chance occurrence.
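The sensitivity to "micro-decisions" described in this abstract can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes a trailing moving average for smoothing and uses Pearson correlation for simplicity (the study reports using a measure less sensitive to outliers). The function names and the toy series are hypothetical.

```python
from statistics import mean


def moving_average(series, window):
    """Trailing moving average; the first window - 1 points are dropped."""
    return [mean(series[i - window + 1 : i + 1]) for i in range(window - 1, len(series))]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def lagged_correlation(sentiment, survey, window, lag):
    """Correlate smoothed sentiment on day t with the survey value on day t + lag."""
    smoothed = moving_average(sentiment, window)
    # The smoothed series starts at index window - 1, so trim the survey to match.
    aligned_survey = survey[window - 1 :]
    if lag > 0:
        smoothed, aligned_survey = smoothed[:-lag], aligned_survey[lag:]
    return pearson(smoothed, aligned_survey)


if __name__ == "__main__":
    # Made-up daily values, for illustration only.
    sentiment = [0.2, 0.5, 0.1, 0.4, 0.6, 0.3, 0.7, 0.5, 0.8, 0.6, 0.9, 0.4]
    survey = [70, 71, 69, 72, 73, 71, 74, 75, 73, 76, 77, 75]
    for window in (1, 3, 5):
        for lag in (0, 1, 2):
            r = lagged_correlation(sentiment, survey, window, lag)
            print(f"window={window} lag={lag} r={r:.2f}")
```

Running the grid over `window` and `lag` shows how the reported association can shift with each choice, which is why such parameters are best fixed before looking at the outcome.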
Article
Public opinion, as necessary a concept as it is to the underpinnings of democracy, is a socially constructed representation of the public that is forged by the methods and data from which it is derived, as well as how it is understood by those tasked with evaluating and utilizing it. I examine how social media manifests as public opinion in the news and how these practices shape journalistic routines. I draw from a content analysis of news stories about the 2016 US election, as well as interviews with journalists, to shed light on evolving practices that inform the use of social media to represent public opinion. I find that despite social media users not reflecting the electorate, the press reported online sentiments and trends as a form of public opinion that services the horserace narrative and complements survey polling and vox populi quotes. These practices are woven into professional routines – journalists looked to social media to reflect public opinion, especially in the wake of media events like debates. Journalists worried about an overreliance on social media to inform coverage, especially Dataminr alerts and journalists’ own highly curated Twitter feeds. Hybrid flows of information between journalists, campaigns, and social media companies inform conceptions of public opinion.
Chapter
This chapter examines the use of social networking sites such as Twitter in measuring public opinion. It first considers the opportunities and challenges that are involved in conducting public opinion surveys using social media data. Three challenges are discussed: identifying political opinion, representativeness of social media users, and aggregating from individual responses to public opinion. The chapter outlines some of the strategies for overcoming these challenges and proceeds by highlighting some of the novel uses for social media that have fewer direct analogs in traditional survey work. Finally, it suggests new directions for a research agenda in using social media for public opinion work.
Article
Finding facts about fake news: There was a proliferation of fake news during the 2016 election cycle. Grinberg et al. analyzed Twitter data by matching Twitter accounts to specific voters to determine who was exposed to fake news, who spread fake news, and how fake news interacted with factual news (see the Perspective by Ruths). Fake news accounted for nearly 6% of all news consumption, but it was heavily concentrated—only 1% of users were exposed to 80% of fake news, and 0.1% of users were responsible for sharing 80% of fake news. Interestingly, fake news was most concentrated among conservative voters. Science, this issue p. 374; see also p. 348
Article
Researchers hoping to make inferences about social phenomena using social media data need to answer two critical questions: What is it that a given social media metric tells us? And who does it tell us about? Drawing from prior work on these questions, we examine whether Twitter sentiment about Barack Obama tells us about Americans’ attitudes toward the president, the attitudes of particular subsets of individuals, or something else entirely. Specifically, using large-scale survey data, this study assesses how patterns of approval among population subgroups compare to tweets about the president. The findings paint a complex picture of the utility of digital traces. Although attention to subgroups improves the extent to which survey and Twitter data can yield similar conclusions, the results also indicate that sentiment surrounding tweets about the president is no proxy for presidential approval. Instead, after adjusting for demographics, these two metrics tell similar macroscale, long-term stories about presidential approval but very different stories at a more granular level and over shorter time periods.
Article
Can Twitter data complement or supplement measures of economic confidence? This possibility was proposed in early work suggesting that sentiment surrounding the word "jobs" on Twitter closely tracked survey measures of consumer confidence. The current study uses knowledge of the processes generating Twitter data to develop and test hypotheses for when social media and survey data might align, and thus when social media processes may reflect survey measures. We expect and find the greatest correspondence when Twitter data were used to predict perceptions of recent societal economic change, rather than the aggregations of pocketbook economic experiences or reports of the state of the economy. In contrast to the concerns many scholars have raised about nonprobability data sources, the results suggest that correspondence between Twitter data and survey data did not depend on how similar Twitter users were to the population. Finally, we find evidence that the correspondences that did emerge were highly variable over time and appeared to be induced in the presence of economic volatility, suggesting that consistent long-scale trends may not be driven by consistent small-scale mechanisms.
Article
Journalists increasingly use social media data to infer and report public opinion by quoting social media posts, identifying trending topics, and reporting general sentiment. In contrast to traditional approaches of inferring public opinion, citizens are often unaware of how their publicly available social media data are being used and how public opinion is constructed using social media analytics. In this exploratory study based on a census-weighted online survey of Canadian adults (N = 1,500), we examine citizens’ perceptions of journalistic use of social media data. We demonstrate that (1) people find it more appropriate for journalists to use aggregate social media data rather than personally identifiable data, (2) people who use more social media are more likely to positively perceive journalistic use of social media data to infer public opinion, and (3) the frequency of political posting is positively related to acceptance of this emerging journalistic practice, which suggests some citizens want to be heard publicly on social media while others do not. We provide recommendations for journalists on the ethical use of social media data and for social media platforms on opt-in functionality.
Article
While big data offer exciting opportunities to address questions about social behavior, studies must not abandon traditionally important considerations of social science research such as data representativeness and sampling biases. Many big data studies rely on traces of people’s behavior on social media platforms such as opinions expressed through Twitter posts. How representative are such data? Whose voices are most likely to show up on such sites? Analyzing survey data about a national sample of American adults’ social network site usage, this article examines what user characteristics are associated with the adoption of such sites. Findings suggest that several sociodemographic factors relate to who adopts such sites. Those of higher socioeconomic status are more likely to be on several platforms suggesting that big data derived from social media tend to oversample the views of more privileged people. Additionally, Internet skills are related to using such sites, again showing that opinions visible on these sites do not represent all types of people equally. The article cautions against relying on content from such sites as the sole basis of data to avoid disproportionately ignoring the perspectives of the less privileged. Whether business interests or policy considerations, it is important that decisions that concern the whole population are not based on the results of analyses that favor the opinions of those who are already better off.
Article
Polling problems in recent elections have called into question whether sample surveys can still produce valid data. A new study provides reassurance.
Article
At the intersection of behavioral and institutional studies of policy making lie a series of questions about how elite choices affect mass public opinion. Scholars have considered how judicial decisions—especially US Supreme Court decisions—affect individuals’ support for specific policy positions. These studies yield a series of competing findings. Whereas past research uses opinion surveys to assess how individuals’ opinions are shaped, we believe that modern techniques for analyzing social media provide analytic leverage that traditional approaches do not offer. We present a framework for employing Twitter data to study mass opinion discourse. We find that the Supreme Court’s decisions relating to same-sex marriage in 2013 had significant effects on how the public discussed same-sex marriage and had a polarizing effect on mass opinion. We conclude by connecting these findings and our analyses to larger problems and debates in the area of democratic deliberation and big-data analysis.
Article
We use 23M Tweets related to the EU referendum in the UK to predict the Brexit vote. In particular, we use user-generated labels known as hashtags to build training sets related to the Leave/Remain campaign. Next, we train SVMs in order to classify Tweets. Finally, we compare our results to Internet and telephone polls. This approach not only reduces the time spent hand-coding data to create a training set, but also achieves a high level of correlation with Internet polls. Our results suggest that Twitter data may be a suitable substitute for Internet polls and may be a useful complement for telephone polls. We also discuss the reach and limitations of this method.
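The hashtag-labeling step mentioned in this abstract might be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the seed hashtag sets are hypothetical examples, and discarding tweets that carry hashtags from both sides is a choice made here for simplicity. The labeled pairs would then feed whatever classifier is trained next (the abstract mentions SVMs).

```python
import re

# Hypothetical seed hashtags, chosen for illustration; the study's lists are not given here.
LEAVE_TAGS = {"voteleave", "leaveeu"}
REMAIN_TAGS = {"strongerin", "voteremain"}


def label_tweet(text):
    """Return 'leave', 'remain', or None when the hashtags are absent or mixed."""
    tags = set(re.findall(r"#(\w+)", text.lower()))
    has_leave = bool(tags & LEAVE_TAGS)
    has_remain = bool(tags & REMAIN_TAGS)
    if has_leave == has_remain:
        # Either no seed hashtag or hashtags from both sides: treat as unlabeled.
        return None
    return "leave" if has_leave else "remain"


def build_training_set(tweets):
    """Keep only tweets that receive an unambiguous hashtag-derived label."""
    labeled = []
    for text in tweets:
        label = label_tweet(text)
        if label is not None:
            labeled.append((text, label))
    return labeled
```

Labels derived this way are noisy, since a hashtag can be used ironically or critically, so a hand-coded validation sample remains useful even when the training set itself is built automatically.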
Article
In recent years, journalists, political elites, and the public have used Twitter as an indicator of political trends. Given this usage, what effect do campaign activities have on Twitter discourse? What effect does that discourse have on electoral outcomes? We posit that Twitter can be understood as a tool for and an object of political communication, especially during elections. This study positions Twitter volume as an outcome of other electoral antecedents and then assesses its relevance in election campaigns. Using a data set of more than 3 million tweets about 2014 U.S. Senate candidates from three distinct groups—news media, political actors, and the public—we find that competitiveness and money spent in the race were the main predictors of volume of Twitter discourse, and the impact of competitiveness of the race was stronger for tweets coming from the media when compared to the other groups. Twitter volume did not predict vote share for any of the 35 races studied. Our findings suggest that Twitter is better understood as a tool for political communication, and its usage may be predicted by money spent and race characteristics. As an object, Twitter use has limited power to predict electoral outcomes.
Article
Spatially or temporally dense polling remains both difficult and expensive using existing survey methods. In response, there have been increasing efforts to approximate various survey measures using social media, but most of these approaches remain methodologically flawed. To remedy these flaws, this article combines 1,200 state-level polls during the 2012 presidential campaign with over 100 million state-located political tweets; models the polls as a function of the Twitter text using a new linear regularization feature-selection method; and shows via out-of-sample testing that when properly modeled, the Twitter-based measures track and to some degree predict opinion polls, and can be extended to unpolled states and potentially substate regions and subday timescales. An examination of the most predictive textual features reveals the topics and events associated with opinion shifts, sheds light on more general theories of partisan difference in attention and information processing, and may be of use for real-time campaign strategy.
Chapter
An alternative and possibly more apt title for this chapter might well be ‘1936 and all that’, a veiled reference to Sellars and Yeatman’s humorous history of England which contained a list of the more fanciful and cliche-bending answers that students had given in examinations. For our purposes, perhaps the most instructive example to be found in that volume relates not to the eponymous date, but the Battle of Bosworth Field in 1492: Noticing suddenly that the Middle Ages were coming to an end, the Barons now made a stupendous effort to revive the old Feudal amenities of Sackage, Carnage, and Wreckage and so stave off the Tudors for a time. They achieved this by a very clever plan, known as the Wars of the Roses. (Sellars and Yeatman 1930)
Article
This article addresses the potential role played by social media analysis in promoting interaction between politicians, bureaucrats, and citizens. We show that in a “Big Data” world, the comments posted online by social media users can profitably be used to extract meaningful information, which can support the action of policymakers along the policy cycle. We analyze Twitter data through the technique of Supervised Aggregated Sentiment Analysis. We develop two case studies related to the “jobs act” labor market reform and the “#labuonascuola” school reform, both formulated and implemented by the Italian Renzi cabinet in 2014–15. Our results demonstrate that social media data can help policymakers to rate the available policy alternatives according to citizens' preferences during the formulation phase of a public policy; can help them to monitor citizens' opinions during the implementation phase; and capture stakeholders' mobilization and de-mobilization processes. We argue that, although social media analysis cannot replace other research methods, it provides a fast and cheap stream of information that can supplement traditional analyses, enhancing responsiveness and institutional learning.
Article
In this article, we examine the relationship between metrics documenting politics-related Twitter activity with election results and trends in opinion polls. Various studies have proposed the possibility of inferring public opinion based on digital trace data collected on Twitter and even the possibility to predict election results based on aggregates of mentions of political actors. Yet, a systematic attempt at a validation of Twitter as an indicator for political support is lacking. In this article, building on social science methodology, we test the validity of the relationship between various Twitter-based metrics of public attention toward politics with election results and opinion polls. All indicators tested in this article suggest caution in the attempt to infer public opinion or predict election results based on Twitter messages. In all tested metrics, indicators based on Twitter mentions of political parties differed strongly from parties’ results in elections or opinion polls. This leads us to question the power of Twitter to infer levels of political support of political actors. Instead, Twitter appears to promise insights into temporal dynamics of public attention toward politics.
Book
Sentiment analysis is the computational study of people's opinions, sentiments, emotions, and attitudes. This fascinating problem is increasingly important in business and society. It offers numerous research challenges but promises insight useful to anyone interested in opinion analysis and social media analysis. This book gives a comprehensive introduction to the topic from a primarily natural-language-processing point of view to help readers understand the underlying structure of the problem and the language constructs that are commonly used to express opinions and sentiments. It covers all core areas of sentiment analysis, includes many emerging themes, such as debate analysis, intention mining, and fake-opinion detection, and presents computational methods to analyze and summarize opinions. It will be a valuable resource for researchers and practitioners in natural language processing, computer science, management sciences, and the social sciences.
Article
Politics and political conflict often occur in the written and spoken word. Scholars have long recognized this, but the massive costs of analyzing even moderately sized collections of texts have hindered their use in political science research. Here lies the promise of automated text analysis: it substantially reduces the costs of analyzing large collections of text. We provide a guide to this exciting new area of research and show how, in many instances, the methods have already obtained part of their promise. But there are pitfalls to using automated methods—they are no substitute for careful thought and close reading and require extensive and problem-specific validation. We survey a wide range of new methods, provide guidance on how to validate the output of the models, and clarify misconceptions and errors in the literature. To conclude, we argue that for automated text methods to become a standard tool for political scientists, methodologists must contribute new methods and new methods of validation.
Article
This article, delivered as the 22nd Memorial Morris Hansen lecture, argues that the contract houses, typified by Westat, are uniquely situated in the cluster of institutions, practices, and principles that collectively constitute a bridge between scientific evidence on the one hand and public policy on the other. This cluster is defined in The Use of Science as Evidence in Public Policy as a policy enterprise that generates a form of social knowledge on which modern economies, policies, and societies depend (National Research Council 2012). The policy enterprise in the U.S. largely took shape in the first half of the twentieth century, when sample surveys and inferential statistics matured into an information system that provided reliable and timely social knowledge relevant to the nation’s policy choices. In ways described shortly, Westat and other social science organizations that respond to “request for proposals” (RFP) from the government for social data and social analysis came to occupy a unique niche. The larger question addressed is whether the policy enterprise as we know it is prepared for the tsunami beginning to encroach on its territory. Is it going to be swamped by a data tsunami that takes information from very different sources than the familiar census/survey methods?