Article

Evaluating the comprehensiveness of Twitter Search API results: A four step method

Abstract

The Twitter Search API (Applications Programming Interface) is a free service that allows software to automatically submit searches to Twitter and retrieve matching tweets. It is widely used to gather tweets for social science and other research, although this is not its main purpose. It does not guarantee to be comprehensive, however, so this article introduces a simple method to check the coverage of its results for narrowly focused topics. An application of the method shows that the results are incomplete, but possibly only due to the filtering out of duplicate, potentially offensive or conversational content.

... To gain insights into the fine-grained complaints or grievances expressed by commuters in Indian cities, we collected transportation-related tweets from the six major Indian cities (Delhi, Mumbai, Bangalore, Kolkata, Hyderabad, and Chennai) from the Twitter microblogging site (presently known as X). For this, we utilized the Twitter Search API (Thelwall 2015) to collect the tweets. We started with a set of generic keywords related to transportation in India, and collected tweets containing one of these keywords and the name of an Indian city. ...
... We utilized the Twitter Search API (Thelwall 2015) to acquire transportation-related tweets from Twitter (which is currently known as X). Since we primarily aim to understand Indian citizens' perceptions and specific complaints of the prevailing transportation services, to gauge a diverse range of opinions and grievances across India's populace, we chose to concentrate on India's six largest metropolitan areas: Delhi, Mumbai, Kolkata, Hyderabad, Bangalore, and Chennai. ...
Article
Full-text available
Due to population growth and rapid urbanization in Indian cities, transportation has evolved as a critical concern affecting a large number of commuters every day. Hence it is important for the urban planners, policymakers, and transportation authorities of India to know about the different public grievances/concerns regarding transportation. This study aims to uncover valuable information about specific transport-related complaints/grievances in Indian cities from the vast pool of user-generated content on social media platforms such as Twitter. As an initial step, we have explored the broad sentiment of commuters in six Indian metropolitan cities about the existing transportation systems, and created a dataset that broadly classifies tweets into negative and positive sentiments. Next, we have identified a set of fine-grained complaints/grievances in these tweets, and thus created the first dataset containing transport-related tweets labelled into various specific complaints/grievances in a multi-label setting. To our knowledge, there is no existing dataset that labels tweets according to specific concerns raised in the posts. We apply several classification models on the dataset, for classifying transportation-related tweets into the specific complaints/grievances. We further conducted a city-wise analysis to better comprehend the specific transport-related complaints prevalent in each Indian city.
... All the search queries performed have relied on the Academic Twitter API v2. Although the service is quite efficient, recovery of all existing tweets for each query is not 100% guaranteed as "the Search API is focused on relevance and not completeness", and some tweets (mainly spam, duplicate tweets or offensive tweets) may be missing from search results (Thelwall, 2015). Although this circumstance is explicitly pointed out by Twitter API v1 and this work uses API v2, it is likely that this same problem would occur. ...
Article
Full-text available
This study attempts to analyze patents as cited/mentioned documents to better understand the interest, dissemination and engagement of these documents in social environments, laying the foundations for social media studies of patents (social Patentometrics). Particularly, this study aims to determine how patents are disseminated on Twitter by analyzing three elements: tweets linking to patents, users linking to patents, and patents linked from Twitter. To do this, all the tweets containing at least one link to a full-text patent available on Google Patents were collected and analyzed, yielding a total of 126,815 tweets (and 129,001 links) to 86,417 patents. The results evidence an increase in the number of linking tweets over the years, presumably due to the creation of a standardized patent URL ID and the integration of Google Patents and Google Scholar, which took place in 2015. The engagement achieved by these tweets is limited (80.2% of tweets did not attract likes) but has been increasing notably since 2018. Two super-publisher Twitter bot accounts (dailypatent and uspatentbot) are responsible for 53.3% of all the linking tweets, while most accounts are sporadic users linking to patents as part of a conversation. The patents most tweeted are, by far, from the United States (87.5% of all links to Google Patents), mainly due to the effect of the two super-publishers. The impact of patents in terms of the number of tweets linking to them is unrelated to their year of publication, status, or number of patent citations received, while controversial and media topics might be more determinant factors. However, further research is needed to better understand the topics discussed around patents on Twitter, the users involved, and the metrics attained. Given the increasing number of linking users and linked patents, this study finds Twitter to be a relevant source to measure patent-level metrics, shedding light on the impact and interest of patents by the broad public.
... Unfortunately, the algorithm used by the Twitter API to sample tweets is unknown. As studied by Thelwall [25], the search endpoint may not be comprehensive. However, the tweets that were not retrieved by its sampling procedure are more likely to be spam. ...
Preprint
Full-text available
COVID-19 has brought about many changes in social dynamics. Stay-at-home orders and disruptions in school teaching can influence bullying behavior in-person and online, both of which lead to negative outcomes in victims. To study cyberbullying specifically, 1 million tweets containing keywords associated with abuse were collected from the beginning of 2019 to the end of 2021 with the Twitter API search endpoint. A natural language processing model pre-trained on a Twitter corpus generated probabilities for the tweets being offensive and hateful. To overcome limitations of sampling, data were also collected using the count endpoint. The fraction of tweets from a given daily sample marked as abusive is multiplied by the number reported by the count endpoint. Once these adjusted counts are assembled, a Bayesian autoregressive Poisson model allows one to study the mean trend and lag functions of the data and how they vary over time. The results reveal strong weekly and yearly seasonality in hateful speech but with slight differences across years that may be attributed to COVID-19.
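The count adjustment described in this abstract can be illustrated with a minimal Python sketch. The function name, the boolean labels, and the example numbers are hypothetical and are not taken from the cited study; the sketch only assumes that each sampled tweet has already been marked as abusive or not.

```python
def adjusted_abusive_count(sample_labels, reported_count):
    """Estimate the number of abusive tweets for one day.

    sample_labels: list of booleans, True if a sampled tweet was marked abusive
                   (hypothetical input; the study derives labels from model probabilities)
    reported_count: total number of matching tweets reported by the count endpoint
    """
    if not sample_labels:
        # No sample for this day, so no estimate can be scaled up.
        return 0.0
    fraction_abusive = sum(sample_labels) / len(sample_labels)
    # Scale the sampled fraction up to the full daily volume.
    return fraction_abusive * reported_count


# Hypothetical day: 3 of 4 sampled tweets marked abusive, 1,000 tweets counted.
print(adjusted_abusive_count([True, True, False, True], 1000))  # 750.0
```

The adjusted daily counts produced this way would then be the inputs to whatever time-series model is fitted afterwards.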
... This is unlikely to be a factor in this focused data set, however, as a tweet referencing a candidate would need to be taken down in less than 15 min to be omitted from the corpus. Given the extended time range of collection, the relatively low volume of tweets over the time of the collection, and considering the study's cutoff time before the midterm results, the Search API is a reliable and representative source of data for the purposes of this study (Thelwall, 2015). ...
Article
In the 2018 U.S. midterm elections, an unprecedented number of American Muslims ran for public office, including the first two Muslim women elected to Congress. This study analyzes the anti-Muslim/anti-immigrant Twitter discourse surrounding Ilhan Omar, one of these two successful candidates. The results identify three categories of accounts that linked Omar to clusters of accounts that shaped the Islamophobic/xenophobic narrative: Influencers, Amplifiers, and Icons. This cadre of accounts played a synergistic and disproportionate role in raising the level of hate speech as a vast network containing a high proportion of apparently inauthentic accounts magnified the messages generated by a handful of provocateurs.
... Tweets were collected between 26 May and 9 December 2012, via Twitter's free application programming interface (API) using the "garden-hose" method, which allows for the real-time collection of 1% of all tweets (Thelwall, 2015; Young et al., 2014). We also collected the metadata associated with the tweets including the users' IP address and time the tweet was sent (Young et al., 2014). ...
Article
Full-text available
Crime monitoring tools are needed for public health and law enforcement officials to deploy appropriate resources and develop targeted interventions. Social media, such as Twitter, has been shown to be a feasible tool for monitoring and predicting public health events such as disease outbreaks. Social media might also serve as a feasible tool for crime surveillance. In this study, we collected Twitter data between May and December 2012 and crime data for the years 2012 and 2013 in the United States. We examined the association between crime data and drug-related tweets. We found that tweets from 2012 were strongly associated with county-level crime data in both 2012 and 2013. This study presents preliminary evidence that social media data can be used to help predict future crimes. We discuss how future research can build upon this initial study to further examine the feasibility and effectiveness of this approach.
Article
Full-text available
Sentiment analysis is an automatic, rule-based computational process that can examine short text messages, user comments, and other textual information and give a sentiment score on a given subject. The current study examines the sentiment of Twitter comments of ten university libraries. The list of ten universities was compiled from the World University Rankings 2019 (Times Higher Education website). A total of 15,850 tweets were collected between 1 January 2013 and 1 September 2019 via the Twitter Application Programming Interface (API) for further analysis. The study found that the overall average positive minus average negative sentiment was 0.4115. Of the 15,850 tweets, the largest share (about 2,760) came from The Bodleian Libraries, University of Oxford, for which the highest friend and follower count (76,180) was also found. Notably, The Bodleian Libraries, University of Oxford had the highest average positive sentiment (1.7728), while the lowest average negative sentiment (about 1.1454) was received by Yale Library. Moreover, the study found that the word "exhibition" was used 499 times and "archive" 401 times in the tweets. Likewise, the word "Congratulations" had an average positive sentiment of 3.0152. The study recommends that libraries use social networking sites and examine the comments, feedback, and reviews that users give on different posts. By doing this, a library will be in a better position to overcome problems and make better decisions for the future.
Article
Who has power in the construction of economic news in the UK? Are social media reshaping how this power is enabled? We examine the public Twitter interactions between journalists, political elites, and what is arguably the UK’s most important think tank, the Institute for Fiscal Studies (IFS), during the 2015 UK general election campaign. Combining human-coded content analysis and network analysis of Twitter discourse about the IFS during a 38-day period, we explain how and why the authority of this think tank is being translated to social media. We develop a new, social media theory of ‘primary definers’ and show how the political authority on which such roles rest is co-constructed and propagated by professional journalists and political elites. Central to this process is a behaviour we conceptualize and measure: authority signalling. Our findings call into question some of the more sanguine generalizations about social media’s contribution to pluralist democracy. Given the dominance of public service broadcasters in the processes we identify, we argue that, despite the growth of social media, there can be surprising limits on the extent to which contemporary media systems help citizens gain information about the assumptions underlying economic policy.
Conference Paper
Social media provides a rich environment for understanding social connections, interactions and information sharing across many aspects of society. The relative ease of access to social media data through provision of application programming interfaces (APIs) by social media companies has led to a significant number of studies that attempt to understand how social media fits into society and how the public uses it for discourse and information sharing. One of the existing gaps in these studies is the lack of extensive description of the data collection and processing methods. These gaps exist as a result of word limits in existing publication venues and a lack of appropriate publication venues to share this type of fundamental research. The following paper provides extensive detail as to how a 52 million corpus of Twitter data on the 2012 Presidential Election in the United States was collected, parsed and analyzed. This level of detail is imperative in studies of social media as small choices in what data to collect can have material effect on the findings. In addition to the description of the methods, the following paper provides a contribution to knowledge in providing basic characteristics of one of the largest research datasets of social media activity compiled to study political discourse.
Article
Full-text available
Social networking services like Twitter have changed the way people engage with traditional broadcast media. But how social is "second screen" activity? The purpose of this study is to determine if patterns of connected viewing (augmenting television consumption with a second screen) and co-viewing (watching television together) are different for traditionally broadcast, "appointment" television shows versus streaming, asynchronous television releases. This study explores this phenomenon of "co-connected viewing" (a combination of connected and co-viewing) on Twitter for four programs that were all released within seven days of each other: Parks and Recreation, Downton Abbey, House of Cards, and Unbreakable Kimmy Schmidt. Complete datasets (over 200,000 tweets) from 72 hours' worth of Twitter activity for four television programs, two traditional and two streaming, were collected and analyzed. In terms of co-connected viewing, the study found that despite radically different broadcast models and corresponding shapes in Twitter activity, the ratios of social to non-social tweets were nearly identical. Additionally, the study found that the asynchronous, streaming Netflix shows saw more engagement from active Twitter users. Finally, implications are discussed for viewers, fans, advertisers, and the television industry, as well as directions for future research.
Article
Full-text available
This article analyzes Twitter as a potential alternative source of external links for use in webometric analysis because of its capacity to embed hyperlinks in different tweets. Given the limitations on searching Twitter's public API, we decided to use the Topsy search engine as a source for compiling tweets. To this end, we took a global sample of 200 universities and compiled all the tweets with hyperlinks to any of these institutions. Further link data was obtained from alternative sources (MajesticSEO and OpenSiteExplorer) in order to compare the results. Thereafter, various statistical tests were performed to determine the correlation between the indicators and the ability to predict external links from the collected tweets. The results indicate a high volume of tweets, although they are skewed by the presence and performance of specific universities and countries. The data provided by Topsy correlated significantly with all link indicators, particularly with OpenSiteExplorer (r=0.769). Finally, prediction models do not provide optimum results because of high error rates, which fall slightly in nonlinear models applied to specific environments. We conclude that the use of Twitter (via Topsy) as a source of hyperlinks to universities produces promising results due to its high correlation with link indicators, though limited by policies and culture regarding use and presence in social networks.
Article
Full-text available
In recent years, there has been increasing attention in the literature to the possibility of analyzing social media as a useful complement to traditional off-line polls to monitor an electoral campaign. Some scholars claim that by doing so, we can also produce a forecast of the result. Relying on a proper methodology for sentiment analysis remains a crucial issue in this respect. In this work, we apply the supervised method proposed by Hopkins and King to analyze the voting intention of Twitter users in the United States (for the 2012 Presidential election) and Italy (for the two rounds of the centre-left 2012 primaries). This methodology presents two crucial advantages compared to traditionally employed alternatives: a better interpretation of the texts and more reliable aggregate results. Our analysis shows a remarkable ability of Twitter to “nowcast” as well as to forecast electoral results.
Article
Full-text available
Social networking sites and other social media have enabled new forms of collaborative communication and participation for users, and created additional value as rich data sets for research. Research based on accessing, mining, and analyzing social media data has risen steadily over the last several years and is increasingly multidisciplinary; researchers from the social sciences, humanities, computer science and other domains have used social media data as the basis of their studies. The broad use of this form of data has implications for how curators address preservation, access and reuse for an audience with divergent disciplinary norms related to privacy, ownership, authenticity and reliability. In this paper, we explore how the characteristics of the Twitter platform, coupled with an ambiguous and evolving understanding of privacy in networked communication, and divergent disciplinary understandings of the resulting data, combine to create complex issues for curators trying to ensure broad-based and ethical reuse of Twitter data. We provide a case study of a specific data set to illustrate how data curators can engage with the topics and questions raised in the paper. While some initial suggestions are offered to librarians and other information professionals who are beginning to receive social media data from researchers, our larger goal is to stimulate discussion and prompt additional research on the curation and preservation of social media data.
Article
Full-text available
Effective crisis management has long relied on both the formal and informal response communities. Social media platforms such as Twitter increase the participation of the informal response community in crisis response. Yet, challenges remain in realizing the formal and informal response communities as a cooperative work system. We demonstrate a supportive technology that recognizes the existing capabilities of the informal response community to identify needs (seeker behavior) and provide resources (supplier behavior), using their own terminology. To facilitate awareness and the articulation of work in the formal response community, we present a technology that can bridge the differences in terminology and understanding of the task between the formal and informal response communities. This technology includes our previous work using domain-independent features of conversation to identify indications of coordination within the informal response community. In addition, it includes a domain-dependent analysis of message content (drawing from the ontology of the formal response community and patterns of language usage concerning the transfer of property) to annotate social media messages. The resulting repository of annotated messages is accessible through our social media analysis tool, Twitris. It allows recipients in the formal response community to sort on resource needs and availability along various dimensions including geography and time. Thus, computation indexes the original social media content and enables complex querying to identify contents, players, and locations. Evaluation of the computed annotations for seeker-supplier behavior with human judgment shows fair to moderate agreement. 
In addition to the potential benefits to the formal emergency response community regarding awareness of the observations and activities of the informal response community, the analysis serves as a point of reference for evaluating more computationally intensive efforts and characterizing the patterns of language behavior during a crisis.
Article
Full-text available
Background: Twitter is an interactive, real-time medium that could prove useful in health care. Tweets from cancer patients could offer insight into the needs of cancer patients. Objective: The objective of this study was to understand cancer patients' social media usage and gain insight into patient needs. Methods: A search was conducted of every publicly available user profile on Twitter in Japan for references to the following: breast cancer, leukemia, colon cancer, rectal cancer, colorectal cancer, uterine cancer, cervical cancer, stomach cancer, lung cancer, and ovarian cancer. We then used an application programming interface and a data mining method to conduct a detailed analysis of the tweets from cancer patients. Results: Twitter user profiles included references to breast cancer (n=313), leukemia (n=158), uterine or cervical cancer (n=134), lung cancer (n=87), colon cancer (n=64), and stomach cancer (n=44). A co-occurrence network is seen for all of these cancers, and each cancer has a unique network conformation. Keywords included words about diagnosis, symptoms, and treatments for almost all cancers. Words related to social activities were extracted for breast cancer. Words related to vaccination and support from public insurance were extracted for uterine or cervical cancer. Conclusions: This study demonstrates that cancer patients share information about their underlying disease, including diagnosis, symptoms, and treatments, via Twitter. This information could prove useful to health care providers.
Conference Paper
Full-text available
Social computational systems emerge in the wild on popular social networking sites like Facebook and Twitter, but there remains confusion about the relationship between social interactions and the technical traces of interaction left behind through use. Twitter interactions and social experience are particularly challenging to make sense of because of the wide range of tools used to access Twitter (text message, website, iPhone, TweetDeck and others), and the emergent set of practices for annotating message context (hashtags, reply to's and direct messaging). Further, Twitter is used as a back channel of communication in a wide range of contexts, ranging from disaster relief to watching television. Our study examines Twitter as a transport protocol that is used differently in different socio-technical contexts, and presents an analysis of how researchers might begin to approach studies of Twitter interactions with a more reflexive stance toward the application programming interfaces (APIs) Twitter provides. We conduct a careful review of existing literature examining socio-technical phenomena on Twitter, revealing a collective inconsistency in the description of data gathering and analysis methods. In this paper, we present a candidate architecture and methodological approach for examining specific parts of the Twittersphere. Our contribution begins a discussion among social media researchers on the topic of how to systematically and consistently make sense of the social phenomena that emerge through Twitter. This work supports the comparative analysis of Twitter studies and the development of social media theories.
Conference Paper
Full-text available
Twitter, a popular microblogging service, has received much attention recently. An important characteristic of Twitter is its real-time nature. For example, when an earthquake occurs, people make many Twitter posts (tweets) related to the earthquake, which enables detection of earthquake occurrence promptly, simply by observing the tweets. As described in this paper, we investigate the real-time interaction of events such as earthquakes, in Twitter, and propose an algorithm to monitor tweets and to detect a target event. To detect a target event, we devise a classifier of tweets based on features such as the keywords in a tweet, the number of words, and their context. Subsequently, we produce a probabilistic spatiotemporal model for the target event that can find the center and the trajectory of the event location. We consider each Twitter user as a sensor and apply Kalman filtering and particle filtering, which are widely used for location estimation in ubiquitous/pervasive computing. The particle filter works better than other compared methods in estimating the centers of earthquakes and the trajectories of typhoons. As an application, we construct an earthquake reporting system in Japan. Because of the numerous earthquakes and the large number of Twitter users throughout the country, we can detect an earthquake by monitoring tweets with high probability (96% of earthquakes of Japan Meteorological Agency (JMA) seismic intensity scale 3 or more are detected). Our system detects earthquakes promptly and sends e-mails to registered users. Notification is delivered much faster than the announcements that are broadcast by the JMA.
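The idea of treating each tweeting user as a noisy sensor can be illustrated with a minimal one-dimensional Kalman filter for a static location. This is a sketch under stated assumptions, not the cited paper's model: the function names, the vague prior, and the fixed measurement variance are all hypothetical, and the state is assumed not to move (no process noise), as might be the case for a single coordinate of an earthquake center.

```python
def kalman_static_update(estimate, variance, measurement, measurement_variance):
    """One Kalman update for a state that does not change between reports."""
    # The gain weighs the new report against the current uncertainty.
    gain = variance / (variance + measurement_variance)
    new_estimate = estimate + gain * (measurement - estimate)
    new_variance = (1.0 - gain) * variance
    return new_estimate, new_variance


def estimate_location(reports, measurement_variance=1.0):
    """Fuse noisy one-dimensional location reports (one per user) into an estimate."""
    estimate, variance = 0.0, 1e6  # vague prior: almost no initial knowledge (assumption)
    for report in reports:
        estimate, variance = kalman_static_update(
            estimate, variance, report, measurement_variance
        )
    return estimate, variance


# Hypothetical latitude reports derived from three users' tweets.
center, uncertainty = estimate_location([35.2, 35.6, 35.4])
print(center, uncertainty)
```

With equal measurement variances and a vague prior, the estimate approaches the mean of the reports and the variance shrinks as more reports arrive. Tracking a moving target such as a typhoon would need a motion model and process noise, which this sketch leaves out.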
Article
Full-text available
This paper offers a descriptive account of Twitter (a micro-blogging service) across four high profile, mass convergence events—two emergency and two national security. We statistically examine how Twitter is being used surrounding these events, and compare and contrast how that behavior is different from more general Twitter use. Our findings suggest that Twitter messages sent during these types of events contain more displays of information broadcasting and brokerage, and we observe that general Twitter use seems to have evolved over time to offer more of an information-sharing purpose. We also provide preliminary evidence that Twitter users who join during and in apparent relation to a mass convergence or emergency event are more likely to become long-term adopters of the technology.
Article
In April 2010, the U.S. Library of Congress and the popular micro-blogging company Twitter announced that every public tweet, since Twitter's inception in March 2006, will be archived digitally at the Library and made available to researchers. The Library of Congress' planned digital archive of all public tweets holds great promise for the research community, yet, over five years since its announcement, the archive remains unavailable. This paper explores the challenges faced by the Library that have prevented the timely realization of this valuable archive, divided into two categories: challenges involving practice, such as how to organize the tweets, how to provide useful means of retrieval, how to physically store them; and challenges involving policy, such as the creation of access controls to the archive, whether any information should be censored or restricted, and the broader ethical considerations of the very existence of such an archive, especially privacy and user control.
Conference Paper
There is a growing need to make sense of all the raw data available on the Internet, hence, the purpose of this study is to explore the capabilities of data mining algorithms applied to social networks. We propose a system to mine public Twitter data for information relevant to obesity and health as an initial case study. This paper details the findings of our project and critiques the use of social networks for data mining purposes.
Article
The worldwide span of the microblogging service Twitter provides an opportunity to make international comparisons of trending topics of interest, such as news stories. Previous international comparisons of news interests have tended to use surveys and may bypass topics not well covered in the mainstream media. This study uses 9 months of English-language Tweets from the United Kingdom, United States, India, South Africa, New Zealand, and Australia. Based upon the top 50 trending keywords in each country from the 0.5 billion Tweets collected, festivals or religious events are the most common, followed by media events, politics, human interest, and sports. U.S. trending topics have the most interest in the other countries and Indian trending topics the least. Conversely, India is the most interested in other countries’ trending topics and the United States the least. This gives evidence of an international hierarchy of perceived importance or relevance with some issues, such as the international interest in U.S. Thanksgiving celebrations, apparently not being directly driven by the media. This hierarchy echoes, and may be caused by, similar news coverage trends. Although the current imbalanced international news coverage does not seem to be out of step with public news interests, the political implication is that the Twitter-using public reflects, and hence seems to implicitly accept, international imbalances in news media agenda setting rather than combating them. This is an issue for those believing that these imbalances make the media too powerful. © 2012 Wiley Periodicals, Inc.
Article
We continue our conversation on open access ("Information Wants to be Free" XRDS Spring 2013) by taking a closer look at a few recent developments, which highlight some of the conflicting interests fueling the debate over academic publishing.
Article
This article asks if it is possible to use commercial data analysis software and digital by-product data to do critical social science. In response, this article introduces social media data aggregator software to a social science audience. The article explores how this particular software can be used to do social research. It uses some specific examples in order to elaborate upon the potential of the software and the type of insights it can be used to generate. The aim of the article is to show how digital by-product data can be used to see the social in alternative ways; it explores how this commercial software might enable us to find patterns amongst 'monumentally detailed data'. As such, it responds to Andrew Abbott's as yet unresolved eleven-year-old reflections on the crucial challenges that face the social sciences in a data-rich era.
Article
The microblogging site Twitter generates a constant stream of communication, some of which concerns events of general interest. An analysis of Twitter may, therefore, give insights into why particular events resonate with the population. This article reports a study of a month of English Twitter posts, assessing whether popular events are typically associated with increases in sentiment strength, as seems intuitively likely. Using the top 30 events, determined by a measure of relative increase in (general) term usage, the results give strong evidence that popular events are normally associated with increases in negative sentiment strength and some evidence that peaks of interest in events have stronger positive sentiment than the time before the peak. It seems that many positive events, such as the Oscars, are capable of generating increased negative sentiment in reaction to them. Nevertheless, the surprisingly small average change in sentiment associated with popular events (typically 1% and only 6% for Tiger Woods' confessions) is consistent with events affording posters opportunities to satisfy pre-existing personal goals more often than eliciting instinctive reactions. © 2011 Wiley Periodicals, Inc.
Conference Paper
Twitter is a microblogging website where users read and write millions of short messages on a variety of topics every day. This study uses the context of the German federal election to investigate whether Twitter is used as a forum for political deliberation and whether online messages on Twitter validly mirror offline political sentiment. Using LIWC text analysis software, we conducted a content analysis of over 100,000 messages containing a reference to either a political party or a politician. Our results show that Twitter is indeed used extensively for political deliberation. We find that the mere number of messages mentioning a party reflects the election result. Moreover, joint mentions of two parties are in line with real-world political ties and coalitions. An analysis of the tweets' political sentiment demonstrates close correspondence to the parties' and politicians' political positions, indicating that the content of Twitter messages plausibly reflects the offline political landscape. We discuss the use of microblogging message content as a valid indicator of political sentiment and derive suggestions for further research. Copyright © 2010, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
GNIP (2011). Guide to the Twitter API - Part 3 of 3: An overview of Twitter's Streaming API. https://blog.gnip.com/tag/gardenhose/
Levine, N., Mann, T. A., & Mannor, S. (2015). Actively learning to attract followers on Twitter. arXiv preprint arXiv:1504.04114.
Sugumaran, R., & Voss, J. (2012). Real-time spatio-temporal analysis of West Nile Virus using Twitter data. In Proceedings of the 3rd International Conference on Computing for Geospatial Research and Applications (p. 39). New York: ACM Press.
Cybermetrics. Issues Contents: Vol. 18-19 (2015): Paper 1. Evaluating the Comprehensiveness of Twitt...
Wisdom, D. (2013). How Twitter gets in the way of knowledge. BuzzFeed News. http://www.buzzfeed.com/nostrich/how-twitter-gets-in-the-way-of-research
Zimmer, M., & Proferes, N. J. (2014). A topology of Twitter research: Disciplines, methods, and ethics. Aslib Journal of Information Management, 66(3), 250-261.