Article

The spread of true and false news online

Authors: Soroush Vosoughi, Deb Roy, Sinan Aral

Abstract

Lies spread faster than the truth. There is worldwide concern over false news and the possibility that it can influence political, economic, and social well-being. To understand how false news spreads, Vosoughi et al. used a data set of rumor cascades on Twitter from 2006 to 2017. About 126,000 rumors were spread by ∼3 million people. False news reached more people than the truth; the top 1% of false news cascades diffused to between 1000 and 100,000 people, whereas the truth rarely diffused to more than 1000 people. Falsehood also diffused faster than the truth. The degree of novelty and the emotional reactions of recipients may be responsible for the differences observed. Science, this issue p. 1146
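The cascade measures the abstract compares across true and false stories (how many people a rumor reached, and how deep the retweet chain went) can be sketched from parent-pointer data. This is an illustrative reconstruction, not the authors' actual pipeline; the tweet-id schema is invented.

```python
def cascade_stats(retweets):
    """Size and depth of one rumor cascade.

    `retweets` maps each tweet id to its parent tweet id
    (None for the cascade root). Hypothetical schema.
    """
    depths = {}

    def depth(node):
        # Memoized walk up the parent chain to the root.
        if node in depths:
            return depths[node]
        parent = retweets[node]
        d = 0 if parent is None else depth(parent) + 1
        depths[node] = d
        return d

    for node in retweets:
        depth(node)
    size = len(retweets)             # tweets (≈ users) reached
    max_depth = max(depths.values()) # longest retweet chain
    return size, max_depth
```

Size here counts tweets in the cascade and depth is the longest root-to-leaf retweet chain, the two diffusion quantities the summary above contrasts.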


... Fake news and misinformation have become powerful forces in today's world, influencing elections, public health decisions, and even the way we interact with one another (Lazer et al., 2018). Studies have shown that false news spreads faster and more widely than true news, primarily because it is often designed to be shocking, emotionally compelling, or simply more entertaining than the truth (Vosoughi et al., 2018). A well-known study by MIT researchers found that false stories on Twitter were 70% more likely to be retweeted than true ones (Friggeri et al., 2014). ...
... Even mainstream platforms like Facebook, Twitter, and YouTube struggle to combat misinformation (Pennycook & Rand, 2019). A single misleading tweet can go viral within minutes, and corrections rarely spread as widely as the original falsehood (Vosoughi et al., 2018). ...
... Misinformation can shape elections, influence public health decisions, and alter market behavior, all while fostering division and mistrust (Friggeri et al., 2014; Swire et al., 2017). When left unchecked, misinformation can lead to widespread panic, civil unrest, and even violence (Vosoughi et al., 2018). A striking example of this is the "Pizzagate" conspiracy theory, which surfaced in 2016. ...
Article
Full-text available
The proliferation of fake news in our time presents a major challenge to public discourse and informed decision-making. As social media platforms expose individuals to vast amounts of information, distinguishing between credible content and misinformation has become increasingly difficult. This paper examines the mechanisms that facilitate the spread of fake news, including clickbait headlines and emotional manipulation, which exploit cognitive biases and contribute to the virality of misleading narratives. Emphasizing the urgent need for enhanced media literacy, the study advocates for educational initiatives that equip individuals with critical thinking skills to assess information sources effectively. By fostering media literacy, societies can strengthen resilience against misinformation, promote trust in reliable sources, and support evidence-based decision-making in an increasingly complex information landscape.
... Researchers have also studied the information propagated on OSNs, using various terms to describe it in their studies. For example, Vosoughi et al. [28] analyse the propagation of stories on Twitter and classify them as "true" or "false" stories. They clarify that they avoid the term "fake news": because politicians have adopted a strategy of labelling news sources that do not support their positions as unreliable or fake, while labelling supportive sources as reliable, the term has lost all connection to the actual veracity of the information presented, rendering it meaningless for academic classification. ...
... Their work also incorporates a factor called information literacy, which is the ability of individuals to make judgements that distinguish between misinformation and disinformation. In [28], the authors define "misinformation" as inaccurate or misleading information, such as "fake news", which involves a deliberate distortion of the truth. In [31], the authors studied the spread of both positive information (e.g., ideas, news, and opinions) and negative information (e.g., rumours and gossip). ...
... Zhou et al. [PS95] review studies that reveal the characteristics of fake news, disinformation, and clickbait. They note that fake news spreads faster and more broadly than the truth, particularly in political contexts [28], [93]. Disinformation is intentionally false information, with psychological and social science theories identifying linguistic cues and patterns that distinguish lies from the truth. ...
Article
Full-text available
Online Social Networks (OSNs) have become a significant research focus across various fields. The increase in their use has prompted numerous studies, particularly on the complex Information Propagation (IP) process, which researchers have approached from different perspectives and lines of investigation. The work presented in this article aims to analyse the state of the art on IP in OSNs, mapping the models, methods, algorithms, tools, and techniques developed in this domain. In particular, we have conducted a Systematic Mapping Study (SMS). To our knowledge, this is the first study to address this issue. The SMS collected 424 studies and analysed 175 primary studies, and the results reveal that most studies are model proposals, the most researched topic is Influence Maximisation (IM), and Twitter (now X) is the most commonly used resource in experiments. Also, the SMS reveals that there is no formal classification of the terms to refer to propagated information. In addition, we also found several proposals to mitigate or control IP. However, there is no common methodological framework to reduce IP. To conclude the study, we propose groups of features/attributes of users during IP and a propagated information classification. This research provides a general and organised overview for the scientific community regarding studies on IP in OSNs.
... The development of artificial intelligence technology is bringing about fundamental changes in data generation methods and the information environment. In particular, the recent rapid spread of AI-generated data deserves attention as it entails the possibility of causing qualitative changes in information as well as various social problems (Pariser, 2011;Vosoughi et al., 2018). Table 4 summarizes the key trends and implications of AI's impact on the information ecosystem. ...
... The proliferation of AI-generated data has far-reaching implications for the information ecosystem and society at large. One major concern is the potential for AIgenerated content to spread misinformation and disinformation (Vosoughi et al., 2018). As AI models become more sophisticated in generating realistic text and images, it becomes increasingly difficult to distinguish between authentic and synthetic content. ...
Article
Full-text available
The rapid advancement of artificial intelligence (AI) technology is profoundly transforming the information ecosystem, reshaping the ways in which information is produced, distributed, and consumed. This study explores the impact of AI on the information environment, examining the challenges and opportunities for sustainable development in the age of AI. The research is motivated by the need to address the growing concerns about the reliability and sustainability of the information ecosystem in the face of AI-driven changes. Through a comprehensive analysis of the current AI landscape, including a review of existing literature and case studies, the study diagnoses the social implications of AI-driven changes in information ecosystems. The findings reveal a complex interplay between technological innovation and social responsibility, highlighting the need for collaborative governance strategies to navigate the tensions between the benefits and risks of AI. The study contributes to the growing discourse on AI governance by proposing a multi-stakeholder framework that emphasizes the importance of inclusive participation, transparency, and accountability in shaping the future of information. The research offers actionable insights for policymakers, industry leaders, and civil society organizations seeking to foster a trustworthy and inclusive information environment in the era of AI, while harnessing the potential of AI-driven innovations for sustainable development.
... Bots, or automated accounts, exploit online ecosystems to disseminate false content. Despite this, research shows that humans are still major propagators of misinformation (Schlette et al., 2022; Rogers, 2020; Vosoughi et al., 2018). Identifying bots is essential, but humans' inability to distinguish them from real accounts leads to the inadvertent spreading of misinformation (Torabi Asr and Taboada, 2019). ...
... these findings suggest Class 1 misinformation may stem from (or emphasize) uncertain, difficult-to-trace sources. The literature review highlights that humans' inability to distinguish bots from real accounts leads to the inadvertent spreading of misinformation (Torabi Asr and Taboada, 2019; Schlette et al., 2022; Rogers, 2020; Vosoughi et al., 2018). The results highlight the importance of engagement metrics, timestamps, and interaction patterns in distinguishing between malicious and non-malicious misinformation actors. ...
Article
Full-text available
Introduction: Telegram’s privacy-focused architecture has made it a fertile ground for the spread of misinformation, yet its closed nature poses challenges for researchers. This study addresses the methodological gap in capturing and analysing misinformation on Telegram, with a particular focus on the anti-vaccination community. Methods: The research was conducted in three phases: (1) a structured review of literature on misinformation dissemination via Telegram, (2) development of a conceptual framework incorporating features of message creators, message content, intended targets and broader social context, and (3) application of this framework to anti-vaccination Telegram channels using latent profile analysis (LPA). A dataset comprising 7,550 messages from 151 Telegram channels was manually annotated and analysed. Results: LPA identified distinct profiles among the channels. Malicious and non-malicious channels showed significant differences in their communication patterns, particularly in the use of crisis framing, discursive manipulation, and thematic orientation. T-tests confirmed these distinctions. Discussion: The findings highlight Telegram’s unique dynamics in misinformation spread and support the utility of the proposed framework in isolating harmful content. The study underscores the need for tailored analytical strategies for platforms with non-standard affordances and suggests that content-based profiling may assist in proactive moderation.
... This situation negatively affects social cohesion and can even disrupt social peace (Guess et al., 2020). By emphasising and polarising social differences, fake news can further marginalise the disadvantaged groups (Vosoughi et al., 2018). ...
... However, participants express that even when claims in fake news are later proven untrue, this correction often goes unnoticed. Numerous studies affirm these concerns in the post-truth era, such as findings showing that fake news spreads much faster on platforms like Twitter compared to accurate information (Vosoughi et al., 2018). ...
Article
As a result of the conflict in Syria in 2011, approximately 3.5 million Syrians migrated to Turkey. A significant number of news articles about Syrian refugees have been produced in traditional media, and various types of content have been created on social media. However, some of these news stories and posts have been based on inaccurate information. This research aims to examine how Syrian refugees interpret the fake news spread about them and how such content affects their daily lives and their relationships with Turkish society. To achieve this objective, focus group discussions were conducted with Syrian refugees working in civil society centres. Various questions were posed to the Syrian refugees about the impact of the spread of fake news on their daily lives, how such news affects their communication processes within society, and what solutions they propose. During the focus group discussion, media texts were distributed which contained fake news or misinformation. According to the study's findings, Syrian refugees frequently encounter fake news in their daily lives, and these reports have negative effects on their daily lives and their integration into the social life of which they are part.
... Previous findings showed that misinformation tended to bypass established, traditional media such as television and newspapers while spreading; in most cases it was acquired mainly through digital media, including the internet and social media [58][59][60]. Digital social networks, rather than traditional media systems such as newspapers, radio, and television, have been considered a malicious medium for spreading mis- and disinformation [58,61,62]. ...
Article
Full-text available
Posts containing myths, misinformation, and facts spread by social media during the COVID-19 pandemic had an enormous effect on psychological health. This study aimed to investigate social media based COVID-19 posts and the psychological health status of participants. A cross-sectional, online survey-based study was conducted between April and October 2021 using a structured and semi-structured questionnaire, predominantly involving 1200 active social network users in Bangladesh. Depression, anxiety, and stress were assessed using the Depression, Anxiety, and Stress Scale (DASS-21), while the Insomnia Severity Index (ISI) measured insomnia severity for selected participants. Internal reliabilities were calculated with Cronbach's alpha coefficients (cut-off point 0.70). Unrelated multivariate logistic regression explored correlations among outcome errors, with the model assessing the impact of selected independent variables on mental health. The findings demonstrated that 27.8% of individuals spread facts whereas 7.4% spread myths and misinformation about COVID-19 on social networks. Furthermore, 28.1% and 36.7% shared obstinate and concerning posts respectively. The prevalence of depression, anxiety and stress symptoms, ranging from mild to extremely severe, was 43.9%, 30.9%, and 23.8% respectively. However, 2.8% had a severe level of insomnia. Facts, myths, tour attending, and no-mask group photos were significantly associated with anxiety, and a lower likelihood of experiencing anxiety. Interestingly, circulating such activities on social networks had no significant association with depression, stress, or insomnia. The spread of misinformation on social media undermines any efforts to contain COVID-19 infection. The findings strongly recommend using fact-checking facilities and adapting to pandemic conditions to maintain a lower prevalence of depression, anxiety, stress and insomnia. PLOS Mental Health | https://doi.org/10.1371/journal.pmen.
... Therefore, individuals may be drawn to their allure. A study on misinformation (Vosoughi et al., 2018) shows that false news spreads more rapidly online than factual information. The study reveals that, despite network and individual factors supporting the truth, people are more likely to retweet false information, thus causing it to spread faster. ...
Article
Full-text available
The teaching profession in Türkiye has been assigned significant responsibilities, accompanied by a historical background. However, the profession's own challenges and its frequent presence in certain social debates create an ambiguous situation regarding its prestige. This study presents a perspective on this issue by examining the collective construction of the teaching identity on Twitter as a discursive tool and how it is positioned through political, religious, secular, and national identities. In parallel, the use of history as a tool for legitimization is evaluated in a separate section. To make the collective construction of the teaching profession more comprehensible, the metaphorical meanings attributed to teaching have been identified and analyzed. Designed as a "case study," this research employs the "document analysis" method as a data collection tool. In this context, 42,403 tweets posted on November 24th, Teachers' Day, in 2020 and 2021 were downloaded using Maxqda 12. The sub-problems of the study were determined based on the most frequently repeated meaningful words within the downloaded tweets. The data were analyzed using content and descriptive analysis methods. To identify the discourses subjected to content and descriptive analysis, a set of key terms was utilized. The findings indicate that the metaphorical meanings attributed to teaching are overwhelmingly positive. In the context of collective identities, teaching is primarily positioned through political identity as a claim for rights. It has been observed that religious and secular identities exist in a mild tension, while national identity maintains a balance between the two. Additionally, history has been used as a tool for legitimization, albeit in a limited manner, through historical periods and figures. Within this framework, it can be argued that the teaching profession in Türkiye has been significantly instrumentalized.
... Misinformation isn't new, but the internet and social media have magnified it. Vosoughi et al. discovered that false news travels faster than the truth on social media platforms, and misinformation is 70% more likely to be retweeted than true information [4]. This is due to: 1. Engagement Algorithms − Social media platforms reward content that produces high engagement (comments, likes, shares, etc.), leading to the amplification of emotionally charged and sensationalist content relative to factual reporting of news and events [5], [8]. ...
Article
Full-text available
The spread of misinformation — including disinformation — has emerged as an insidious problem as social media has proliferated, making it increasingly difficult to find the line between fact and fiction. It has also attracted attention on a large scale in politics, public health, social trust, etc. While only verified media agencies can write and distribute on traditional journalism platforms, online social media allows anyone to publish and spread such information, which has certainly raised the chances of misinformation (Lazer et al., 2018) [1]. The emergence of artificial intelligence (AI) has brought new tools to the table in combatting misinformation. The processing of immense amounts of data, detection of false claims, and evaluation of information at a speed and scale unimaginable to human fact-checkers is what separates AI-based fact-checking systems from traditional approaches (Zhou and Zafarani, 2020) [2]. But AI has its drawbacks, too: challenges remain, such as bias in AI training data, an inability to understand context, the need for human supervision to ensure accuracy, and a persistent struggle to adapt to constantly newly crafted sources of misinformation (Graves, 2018) [3]. In this essay, we explore how AI is being used in fact-checking, the possible advantages and disadvantages of using AI in this domain, and the future of AI in the fight against misinformation.
... In their study to examine the spread of misinformation, Vosoughi et al. (2018) tracked 126,000 separate posts on Twitter covering different topics between 2006 and 2017. They found that the posts were tweeted and retweeted more than 4.5 million times by 3 million people. ...
Article
Full-text available
With the transition from traditional media to new media, discussions on disinformation have increased in recent years due to intentionally or unintentionally shared false information, and combating disinformation has become a major item on the agenda. The integration of people with new media has led to the articulation of the concept of creating value together, transforming media from an object to be consumed into an organism that can be produced, shared, transformed and constantly renewed by everyone. By using social media tools, people have found the opportunity to share their thoughts, ideas and experiences worldwide at any moment, but at the same time, they have started to be exposed to thousands of messages whose accuracy is not clear. They often cannot escape the harmful effects of the manipulation and black propaganda created by disinformation. The aim of this study is to contribute to the scientific literature on the manipulation and perception created by disinformation, which is an important problem of the age, by reviewing the literature on the causes and consequences of false information spread through social media tools in the new media age. For this purpose, a review study is presented, compiled by searching the terms "disinformation" and "disinformation on social media".
... People rely on social media platforms to access news and opinions (Soroya et al., 2021;Swire-Thompson and Lazer, 2020). However, social media also facilitates the spread of misinformation (Vosoughi et al., 2018). The use of social media for health-related discussions and information sharing has witnessed a substantial surge compared with other periods (Verma, 2021). ...
Preprint
Full-text available
This study investigates the relationship between social media fatigue and the sharing and credibility of health misinformation. It further examines how social media fatigue influences individuals' discernment ability: the capacity to differentiate between misinformation and real information. Participants (n = 506) rated the credibility and sharing intentions of real and fake health-related posts on social media platforms. Discernment was assessed using both mean rating differences and receiver operating characteristic curve approaches. The results indicate that social media fatigue is positively associated with the credibility and sharing of health misinformation. However, it also increases the credibility and sharing of real health information. Social media fatigue negatively impacts individuals’ ability to discern misinformation from real information but does not significantly influence the quality of their sharing decisions. This study advances the understanding of social media fatigue by integrating its effects on both misinformation and real information. It highlights the need to assess discernment comprehensively, using rigorous evaluation metrics, and calls attention to the broader implications of social media fatigue on public health communication in the digital age.
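The two discernment measures described above (mean rating differences and a receiver operating characteristic curve approach) can be sketched as follows. This is a hedged reconstruction with invented ratings, not the study's code; the AUC is computed via the pairwise Mann-Whitney form, which equals the area under the ROC curve.

```python
def discernment_scores(real_ratings, fake_ratings):
    """Two ways to score misinformation discernment.

    real_ratings / fake_ratings: credibility ratings one person
    gave to real vs. fake posts (higher = more credible).
    Illustrative inputs, not the study's data.
    """
    # 1) Mean-difference discernment: mean rating of real
    #    posts minus mean rating of fake posts.
    mean_diff = (sum(real_ratings) / len(real_ratings)
                 - sum(fake_ratings) / len(fake_ratings))

    # 2) AUC: probability that a randomly chosen real post is
    #    rated above a randomly chosen fake post (ties = 0.5).
    n_pairs = len(real_ratings) * len(fake_ratings)
    wins = sum((r > f) + 0.5 * (r == f)
               for r in real_ratings for f in fake_ratings)
    auc = wins / n_pairs
    return mean_diff, auc
```

An AUC of 0.5 means no discernment (ratings do not separate real from fake), while 1.0 means perfect separation; unlike the mean difference, the AUC is insensitive to how each participant uses the rating scale.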
... Specifically, this has led to the emergence of users who post inappropriate content, driven by the mindset that any content is acceptable as long as it generates profit. For example, there has been an increase in posts that prioritize impact over truth, such as fake news [5], [6], as well as content deliberately designed to incite online flaming [7]. The widespread dissemination of inappropriate content on social media has significant negative impacts on society. ...
Article
Full-text available
The expansion of the attention economy has led to the growing issue of inappropriate content being posted by profit-driven users. Previous countermeasures against inappropriate content have relied on moderation, which raises ethical concerns, or information diffusion control, which requires considering larger scale networks, including general users. This study proposes an imitation strategy as an intervention method that does not rely on moderation and focuses on a relatively smaller scale competitive network of information disseminators rather than the entire social network. The imitation strategy is a novel approach that utilizes increased competition among information disseminators through imitation to reduce attention to inappropriate content. Through theoretical analysis and numerical simulations, I demonstrate that the imitation strategy is more effective when nodes with higher eigenvector centrality are selected as targets and nodes with lower eigenvector centrality are chosen as imitators.
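The abstract's targeting rule (pick high-eigenvector-centrality nodes as imitation targets and low-centrality nodes as imitators) can be sketched with a plain power iteration. This is a minimal illustration on an invented graph, not the paper's model; the shifted iteration (A + I) is used to avoid oscillation on bipartite graphs.

```python
def eigenvector_centrality(adj, iters=100):
    """Approximate eigenvector centrality by power iteration.

    adj: dict mapping node -> set of neighbours (undirected).
    """
    c = {v: 1.0 for v in adj}
    for _ in range(iters):
        # Shifted update (A + I): add each node's own score so
        # the iteration converges even on bipartite graphs.
        new = {v: c[v] + sum(c[u] for u in adj[v]) for v in adj}
        norm = max(new.values())
        c = {v: x / norm for v, x in new.items()}
    return c

def pick_target_and_imitator(adj):
    c = eigenvector_centrality(adj)
    target = max(c, key=c.get)    # highest centrality: gets imitated
    imitator = min(c, key=c.get)  # lowest centrality: does the imitating
    return target, imitator
```

On a star graph the hub is always the target, matching the paper's finding that intervening on high-centrality disseminators is more effective.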
... The advent of social media platforms has amplified the spread of misinformation by enabling rapid and widespread dissemination. Vosoughi et al. (2018) analyzed the diffusion of true and false news stories on Twitter, finding that falsehoods spread significantly faster and more broadly than accurate information, emphasizing the role of online networks in shaping information dynamics. Misinformation not only undermines public trust in institutions but also poses threats to public health, as evidenced by the proliferation of false claims and conspiracy theories during the COVID-19 pandemic. ...
Article
Full-text available
This study examines the relationship between user attitudes and continuance intention toward Unified Payment Interface (UPI) payment apps in India, with misinformation as a moderating variable. Data were collected from 404 users via an online questionnaire using validated scales, and analysed through Partial Least Squares Structural Equation Modelling (PLS-SEM). Thematic analysis was used to support qualitative insights. Results demonstrate a strong positive influence of user attitudes on continuance intention, driven by perceived benefits such as convenience, security, and efficiency. Notably, misinformation unexpectedly moderated this relationship positively, indicating that users with strong attitudes remain resilient to fake news, prioritizing UPI's functional value. These findings have practical implications for policymakers, fintech firms, and digital platforms aiming to strengthen trust and user engagement. The study contributes to the growing literature on digital payments by highlighting the overlooked role of misinformation in user behavior and providing strategies to foster long-term adoption of UPI.
... Algorithms prioritize engagement, often amplifying sensational or polarizing content at the expense of nuanced and balanced discussion (Tufekci, 2015). Furthermore, the low barriers to content creation enable the proliferation of unverified information, which, coupled with echo chambers, exacerbates misinformation and ideological entrenchment (Vosoughi et al., 2018). These factors are particularly relevant to Generation Z, who are the most digitally immersed demographic, often consuming and interacting with information almost exclusively online. ...
Article
Full-text available
In the digital age, social media has become a pivotal platform for public discourse, particularly among Generation Z. This demographic, known for its digital proficiency, utilizes platforms like Instagram, TikTok, and Twitter to engage in activism, share opinions, and build communities. However, these platforms also present challenges, such as the spread of misinformation, the amplification of polarization, and algorithm-driven content curation. This study aims to analyze the trends, challenges, and implications of social media's role in shaping public discourse among Generation Z. Using a qualitative phenomenological approach, the research collected data through semi-structured interviews and content analysis. Thematic analysis revealed that 35% of social media content is polarizing, often creating ideological echo chambers, while 30% consists of misinformation, undermining trust and informed decision-making. Conversely, activism-related content, accounting for 25% of analyzed posts, highlights the potential for social media to foster civic engagement. The study concludes that while social media empowers Generation Z to participate in societal conversations, it also perpetuates significant challenges. The findings underscore the urgency of digital literacy initiatives and ethical algorithmic reforms to mitigate these issues. These interventions are critical for fostering informed, inclusive, and constructive public discourse in the digital era.
... While offering numerous advantages, social media also presents significant challenges, most notably the rapid and widespread dissemination of false or misleading information, often referred to as an 'infodemic'. False news tends to spread significantly faster on social media compared to real news, largely due to its novelty factor [10]. This misinformation can lead to incorrect interpretations of health information, increased hesitancy toward vaccinations, delays in seeking appropriate healthcare, and even significant mental distress among the public [11]. ...
... The statements of the people surveyed correspond with the most important conclusions drawn from other, primarily broader studies carried out on larger samples, which focus on the typical ways in which recipients of new information react to the false content it contains (Wineburg & McGrew, 2019). According to these studies, they tend to pass false information further online (e.g., using instant messaging), especially when it has certain formal characteristics: it is understandable, features an accessible form, and is emotionally engaging: shocking, strange or surprising (Vosoughi et al., 2018). Unfortunately, they confirm the disturbing conclusion (Guess et al., 2018) that it is easier to convince people of false information by taking advantage of the so-called "disinformation ecosystem" on the web, in which different messages confirm, complement and reinforce each other. ...
Article
Falsification and manipulation of information, using it for image, material or political gain, is a significant phenomenon of contemporary social communication, and no doubt, its scale and significance have made fake news the subject of numerous studies. The purpose of this article is to analyse the attitudes of adult Poles toward fake news, based on the results of a qualitative study conducted as part of the national Infostrateg programme. The study was designed to identify respondents' knowledge and attitudes about fake news, their awareness of the dangers of information manipulation and how they deal with disinformation. A semi-structured individual interview method was used, which made it possible to capture subtle aspects of the respondents' experiences. Data analysis was carried out according to a semi-inductive model, using open coding and comparative analysis. Sampling was based on the criterion of maximum variation, which made it possible to capture a variety of perspectives on fake news. The results indicate that fake news is perceived as an integral part of the modern infosphere, and its presence is widely accepted, although it evokes distrust and caution. Respondents consider them a tool of social disintegration, manipulation of worldviews and network marketing. They show negative emotions toward the phenomenon, while declaring high resistance to information manipulation. The meaning attributed to fake news is reduced to four coherent categories: FN as the creation of a falsified image of reality; as a tool of social disintegration; as a tool for changing or strengthening worldviews; and as a tool of network marketing.
... Fact-checking has emerged as a critical task in the era of social media, where information spreads rapidly [2,7,38] and misinformation can have far-reaching consequences [3,6,19]. Automated fact-checking systems have gained traction as scalable solutions, yet they often grapple with challenges such as handling diverse evidence sources, integrating multimodal data, and presenting comprehensive narratives. ...
Preprint
We propose CRAVE (Cluster-based Retrieval Augmented Verification with Explanation), a novel framework that integrates retrieval-augmented Large Language Models (LLMs) with clustering techniques to address fact-checking challenges on social media. CRAVE automatically retrieves multimodal evidence from diverse, often contradictory, sources. Evidence is clustered into coherent narratives and evaluated via an LLM-based judge to deliver fact-checking verdicts explained by evidence summaries. By synthesizing evidence from both text and image modalities and incorporating agent-based refinement, CRAVE ensures consistency and diversity in evidence representation. Comprehensive experiments demonstrate CRAVE's efficacy in retrieval precision, clustering quality, and judgment accuracy, showcasing its potential as a robust decision-support tool for fact-checkers.
... However, the same mechanisms that allow for the rapid spread of information can also propagate misinformation and fake news, distorting public perception. Vosoughi, Roy and Aral (2018) found that false news stories spread significantly faster and more broadly on Twitter than true stories, particularly in the domains of politics, urban legends, and science. This phenomenon is driven by the emotional and novel nature of false information, which captures user attention and encourages sharing. ...
Article
Full-text available
This study provides an in-depth examination of the influence of social media on Benin City residents' perception of the 2024 Okuama bloodbath in Delta State, Nigeria. The objectives of this study were to establish the extent of awareness of Benin residents about the Okuama bloodbath, to ascertain the channels through which they were exposed to information about the event, and to evaluate how social media influenced their perception of the incident. The researchers employed a survey research design, using a questionnaire as the data collection instrument. A total of 384 respondents were sampled from Benin City, with a response rate of 96.6% (n=371). The respondents comprised individuals aged 18-65, residing in Benin City, who had access to social media platforms. Data analysis revealed that a substantial majority of respondents (65%) reported a high or very high level of awareness about the Okuama bloodbath. This finding suggests that the event received significant attention from the public, and that Benin City residents were well-informed about the incident. Social media was found to be the primary source of information for over half of the respondents (55%), with WhatsApp (30%) and Facebook (25%) being the most frequently used social media platforms for exposure to information about the Okuama bloodbath. The researchers found that social media played a significant role in shaping public perception, with a majority of respondents (75%) believing that social media effectively raised awareness about the Okuama bloodbath. Social media was seen as instrumental in mobilizing support and aid for the affected community, with 85% of respondents agreeing to its positive impact. However, the spread of misinformation on social media was also identified as a major concern, with 80% of respondents acknowledging its exacerbating effect on the conflict.
The study's findings have significant implications for crisis communication, highlighting the importance of social media in shaping public perception during crises. The researchers recommend enhancing the use of official and verified social media channels, implementing media literacy campaigns, and collaborating with social media platforms to flag and remove harmful content. By adopting these strategies, stakeholders can harness the power of social media to promote constructive engagement, mitigate the spread of misinformation, and facilitate effective crisis communication. Furthermore, it was recommended that social media platforms can play a critical role in promoting peace and stability during crises by providing a platform for dialogue and information dissemination, which can help to reduce tensions, promote understanding, and facilitate conflict resolution. However, this requires responsible use of social media, and stakeholders must be aware of the potential risks and challenges associated with social media use during crises.
... Thus, inappropriate consumption of entertainment content on social media could be seen as a threat to personality development. The consumption of fake news or aggressive content can be extremely harmful for young people, as it can distort the perception of reality and form habits that are not committed to journalistic or informational rigor and the exercise of citizenship (Marchi, 2012;Vosoughi et al., 2018). ...
Article
Full-text available
Introduction: Social media are the most popular among young people: they identify with the content and feel they are part of a collective. We analyze the content of the main influencers in Spain and Chile to find out: 1) what they talk about and identify whether it is informational, educational or entertainment content; 2) determine its quality and whether it eventually leads to misinformation and tends to trivialization; and 3) reflect on the quality of the content and how it can affect the configuration of young people's media diet. Methodology: Twelve accounts of influencers on Instagram, TikTok and YouTube were analyzed through 439 pieces of content. A comparative content analysis combining qualitative and quantitative methods was proposed. Results: Influencers talk about a wide variety of topics, but they prioritize their personal lives from an entertainment perspective. The poor quality of the content is noted, which tends to trivialization. Discussion: Although research on the impact of social media on the mental health of young people is extensive, less research focuses on analyzing the content of influencers and how they influence their followers. Conclusions: Identifying this trivialization of content can contribute to the development of public policies and training programs in media literacy and encourage the regulation and self-regulation of content on social media, given the impact on the mental health of young people, who are building their identity.
... The NRC Emotion Lexicon was employed for the sentiment and emotion analysis (Mohammad & Turney, 2013), which contains 14,182 words of eight basic emotions (anger, fear, anticipation, trust, surprise, sadness, joy, and disgust) and three sentiments (positive, neutral, and negative). This lexicon has been used in various sentiment analyses and emotion detection in public health-related topics (Mostafa, 2020;Vosoughi et al., 2018;Zhao et al., 2022). Descriptions of how various "appeals" are identified in the Facebook post with lexicons from NRC sentiments and emotions were provided in Supplementary Materials: Table S1. ...
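The lexicon-based scoring described above can be illustrated with a short sketch: each word in a post is looked up in an emotion lexicon and the matching emotion/sentiment labels are tallied. The miniature lexicon and example text below are invented stand-ins for the real 14,182-word NRC Emotion Lexicon, not data from the study.

```python
from collections import Counter

# Hypothetical miniature stand-in for the NRC Emotion Lexicon:
# each word maps to the emotion/sentiment labels it is annotated with.
TOY_LEXICON = {
    "crisis":  ["fear", "negative"],
    "hope":    ["anticipation", "joy", "positive"],
    "fresh":   ["positive"],
    "danger":  ["fear", "negative"],
    "trusted": ["trust", "positive"],
}

def emotion_profile(text, lexicon=TOY_LEXICON):
    """Count how often each emotion/sentiment label fires in a text."""
    counts = Counter()
    for word in text.lower().split():
        counts.update(lexicon.get(word.strip(".,!?"), []))
    return counts

profile = emotion_profile("Fresh food is a trusted answer to the climate crisis")
```

In practice the counts (or their normalized shares) are then used to label a post's dominant "appeal"; here the example text scores positive twice (fresh, trusted) and fear once (crisis).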
Article
Full-text available
Effective communication about sustainable healthy diets (SHDs) is a key step to facilitate sustainable transformation of urban food systems in the context of rapid urbanization and the climate change crisis. Based on the theory of agenda setting in social media, this study aimed to answer: (RQ1) who are the key urban stakeholders that communicated SHDs, and what is their agenda? (RQ2) what strategies did they use, and how effective were they in terms of user engagement? and (RQ3) how did SHD promotion connect different stakeholders? 38,004 Facebook posts from 7938 public pages mentioning SHDs across four regions from 2019 to 2022 were retrieved for empirically evaluating their agenda setting. For RQ1, we identified 11 categories of stakeholders and 10 major topics. The topic of supporting local food system development, which aligned with key developmental strategies of sustainable urban food systems, was the most salient. For RQ2, we found that the emotional appeals of anticipation, trust, and fear received greater user engagement. For RQ3, the network analysis highlights that participation by local government and non-government organizations leads to better network connections. This study provides insights into the sustainable transformation of urban food systems and effective communication strategies for promoting civil engagement in the transformation processes.
... The dataset was collected via the EFSCN and IFCN-certified fact-checking organisation TjekDet's [CheckIt] website, where the fact-checkers exemplify claims rated as "entirely or partly false" through direct links to Facebook. Factchecking agencies are a common data source in studies of misinformation due to the convenience of an already annotated sample (López-García et al., 2021;Song et al., 2021;Vosoughi et al., 2018). The data were collected by scraping all external links with the word "Facebook" from the "entirely or partly false" web page. ...
Article
Full-text available
Social media facilitate a competition for users’ limited attention by bringing various content together, from health advice to entertainment, and from updates from loved ones to misinformation. Especially misinformation has raised societal concern. We evaluated the influence of visual material and cognitive factors of attraction, specifically valenced sentiment, threat-related, intergroup-related, and social information, on engagement scores (i.e., shares, comments, and reactions). We analysed 356 misleading Danish Facebook posts sampled through the fact-checking association TjekDet’s “entirely or partly false” web page by fitting a Bayesian zero-inflated negative binomial regression model. The study showed that videos and images were exceptionally strong predictors of engagement, especially shares. Positivity, negativity, and intergroup-related information also increased engagement, but social information and threat-related information reduced it. Our findings suggest that in a highly competitive online environment, some content biases are stronger than others. Finally, we discuss the potential moderators of their effect such as the users’ reputation management strategies.
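The zero-inflated negative binomial model used above handles engagement counts with an excess of zeros by mixing a point mass at zero (posts that never circulate) with a negative binomial count distribution. A minimal sketch of the probability mass function, using illustrative parameter values rather than the study's fitted estimates:

```python
import math

def nb_pmf(k, r, p):
    # Negative binomial: probability of k failures before the r-th success.
    return math.comb(k + r - 1, k) * (p ** r) * ((1 - p) ** k)

def zinb_pmf(k, pi, r, p):
    # Zero-inflated NB: with probability pi the count is a "structural" zero,
    # otherwise it follows the negative binomial distribution.
    base = (1 - pi) * nb_pmf(k, r, p)
    return pi + base if k == 0 else base

# Illustrative parameters (not fitted values): 30% structural zeros.
pi, r, p = 0.3, 2, 0.5
total = sum(zinb_pmf(k, pi, r, p) for k in range(200))
```

The zero-inflation term is what lets the model fit data where far more posts receive zero shares or comments than a plain count distribution would predict.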
... The impact of Big Tech monopolies on the spread of misinformation is particularly alarming. An MIT study by Vosoughi et al. (2018) found that false news spreads six times faster than factual information on Twitter, highlighting the role of platform algorithms in amplifying sensationalist and politically charged content. This distortion of public discourse is further compounded by "echo chambers" and "filter bubbles", where users are algorithmically siloed into ideological groups, reducing exposure to diverse perspectives and reinforcing political polarization (Pariser, 2011). ...
Article
The emergence of digital capitalism has transformed economic, political, and social structures, shifting power from traditional industries to Big Tech monopolies that dominate global information flows. Unlike classical capitalism, which revolved around physical production and labor exploitation, digital capitalism thrives on data extraction, algorithmic governance, and predictive analytics. This transformation has led to the rise of the influence industry, a system where corporations, political actors, and state entities manipulate public opinion through microtargeted advertising, AI-driven misinformation, and algorithmic biases. As companies like Google, Meta (Facebook), and Amazon consolidate power, they exert unprecedented control over digital communication, influencing political discourse, electoral outcomes, and media narratives. This study critically examines the transition from classical to digital capitalism, analyzing how the influence industry and Big Tech monopolies exacerbate economic inequality, distort democratic processes, and facilitate the spread of misinformation. The research highlights key issues, including the Facebook–Cambridge Analytica scandal, AI-driven propaganda in geopolitical conflicts, and algorithmic amplification of extremist content. These developments underscore the urgent need for regulatory interventions to mitigate the risks posed by digital monopolies.
... In addition, it is important to understand sharing practices, as Twitter posts are also subject to likes and retweets. Previous research has investigated emotion as a motivator for retweeting news (20), but to our knowledge, information quality, and whether quality is a predictor of engagement, has not been investigated. Therefore, in the context of widespread sharing of misinformation, it is important to understand the quality of the information that has the potential to be widely shared and how this influences the debate in question (21). ...
Article
Full-text available
Objective: To use the validated Online Quality Assessment Tool (OQAT) to assess the quality of online nutrition information. Setting: The social networking platform formerly known as Twitter (now X). Design: Utilising the Twitter search application programming interface (API; v1.1), all tweets that included the word ‘nutrition’, along with associated metadata, were collected on seven randomly selected days in 2021. Tweets were screened, those without a URL were removed, and the remainder were grouped on retweet status. Articles (shared via URL) were assessed using the OQAT, and quality levels assigned (low, satisfactory, high). Mean differences between retweeted and non-retweeted data were assessed by Mann-Whitney U test. The Cochran-Mantel-Haenszel test was used to compare information quality by source. Results: In total, 10,573 URLs were collected from 18,230 tweets. After screening for relevance, 1,005 articles were assessed (9,568 were out of scope), sourced from: professional blogs (n=354), news outlets (n=213), companies (n=166), personal blogs (n=120), NGOs (n=60), magazines (n=55), universities (n=19), and government (n=18). Rasch measures indicated the quality levels: 0-3.48, poor; 3.49-6.3, satisfactory; and 6.4-10, high quality. Personal and company-authored blogs were more likely to rank as poor quality. There was a significant difference in quality between retweeted (n=267, sum of rank, 461.6) and non-retweeted articles (n=738, sum of rank, 518.0), U = 87475, p=0.006, but no significant effect of information source on quality. Conclusions: Lower-quality nutrition articles were more likely to be retweeted. Caution is required when using or sharing articles, particularly from companies and personal blogs, which tended to be lower-quality sources of nutritional information.
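The Mann-Whitney U test used above compares the two groups by ranks rather than raw values, which suits ordinal quality scores. A minimal pure-Python sketch of the U statistic, with average ranks for ties; the score lists in the example are made up, not the study's data:

```python
def mann_whitney_u(x, y):
    """Return (U1, U2) for samples x and y, using average ranks for ties."""
    allv = sorted(list(x) + list(y))

    def avg_rank(v):
        lo = allv.index(v)              # first position of v (0-based)
        hi = lo + allv.count(v)         # one past the last position
        return (lo + 1 + hi) / 2        # mean of the 1-based ranks

    r1 = sum(avg_rank(v) for v in x)    # rank sum of the first sample
    n1, n2 = len(x), len(y)
    u1 = r1 - n1 * (n1 + 1) / 2
    return u1, n1 * n2 - u1

# Hypothetical OQAT-style quality scores for two groups of articles.
u1, u2 = mann_whitney_u([2.1, 3.4, 4.8], [5.2, 6.1, 6.4])
```

Because it uses only ranks, the test makes no normality assumption about the underlying quality scores, which is why it pairs naturally with Rasch-derived ordinal measures.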
... Additionally, political disinformation has one of the highest virality rates (Aral, 2021). It should be noted that false content travels up to 70% faster than true information (Vosoughi et al., 2018). ...
Article
Full-text available
Electoral campaigns are one of the key moments of democracy. In recent times, the circulation of disinformation has increased during these periods. This phenomenon has serious consequences for democratic health since it can alter the behaviour and decisions of voters. This research aims to analyse the features of this phenomenon during the 2024 European Parliament elections in a comparative way. The applied methodology is based on quantitative content analysis. The sample ( N = 278) comprises false information verified by 52 European fact-checking agencies about the campaign for the European elections in 20 EU countries. The analysis model includes variables such as time-period, country, propagator platform, topic, and the type of disinformation. The results show that the life cycle of electoral disinformation goes beyond the closing of the polls assuming a permanent nature. In addition, national environments condition the profiles of this question, which is more intense in Southern and Eastern Europe. Furthermore, although multiple channels are involved, digital platforms with weak ties are predominant in disseminating hoaxes. Finally, migration and electoral integrity are the predominant topics. This favours the circulation of an issue central to the far-right agenda and aims to discredit elections and their mechanisms to undermine democracy. These findings establish the profiles of this problem and generate knowledge to design public policies that combat electoral false content more effectively.
... Therefore, broadcast media often serve as one of the primary sources of news for many individuals (Pew Research Center, 2020). In recent years, the influence of digital platforms has led to an increase in online misinformation and the spread of fake news (Vosoughi, Roy & Aral, 2018). Fake news can permeate broadcast media through the production and dissemination of deceptive content by certain media organisations or through the repetition of false information by journalists (Sundar, Oeldorf-Hirsch & Xu, 2017; Okoro & Nwafor, 2013). ...
Article
Full-text available
Background: The deployment of the Bimodal Voter Accreditation System (BVAS) in the 2023 Nigerian presidential election aimed to enhance the integrity of the electoral process. However, public perception of the level of success of the gadget depends on their knowledge and understanding of how the system works. Objective: This study examined the perception of broadcast media reportage on the use of BVAS during the 2023 Nigerian presidential election. Method: A quantitative research design was employed, and data were collected through a questionnaire administered to 384 respondents in Edo State, Nigeria. Results: The findings revealed that broadcast media played a significant role in informing the public about BVAS, with 77.3% of respondents accessing information about BVAS through television. The study also found that respondents had a positive perception of broadcast media reports on BVAS. Conclusion: The study concludes that broadcast media reportage was effective in shaping public perception and understanding of BVAS during the 2023 Nigerian presidential election. Unique Contribution: This study provides new insights into the role of broadcast media in promoting public understanding and acceptance of electoral technologies in Nigeria. Key Recommendations: It is recommended that the Independent National Electoral Commission (INEC) collaborate with broadcast media organizations to amplify public awareness and education on electoral technologies. Additionally, INEC professionals should be trained to effectively utilise broadcast media channels for information dissemination on electoral technologies.
... Although the observed effect sizes of the current study may not seem large at first glance (e.g., the average performance gap between Gen Zs and Millennials is 0.25 standard deviations, equivalent to approximately 0.6 points on the MIST, or 3 percentage points), given how much time most people spend on the internet, how much misinformation circulates online, how easily it spreads (Vosoughi et al., 2018), and how consequential it may be, from decisions regarding vaccinations to decisions at the ballot box, the real-world impact of these differences may be nontrivial. ...
Article
Full-text available
The global spread of misinformation poses a serious threat to the functioning of societies worldwide. But who falls for it? In this study, 66,242 individuals from 24 countries completed the Misinformation Susceptibility Test (MIST) and indicated their self-perceived misinformation discernment ability. Multilevel modelling showed that Generation Z, non-male, less educated, and more conservative individuals were more vulnerable to misinformation. Furthermore, while individuals' confidence in detecting misinformation was generally associated with better actual discernment, the degree to which perceived ability matched actual ability varied across subgroups. That is, whereas women were especially accurate in assessing their ability, extreme conservatives' perceived ability showed little relation to their actual misinformation discernment. Meanwhile, across all generations, Gen Z perceived their misinformation discernment ability most accurately, despite performing worst on the test. Taken together, our analyses provide the first systematic and holistic profile of misinformation susceptibility.
... Democratic processes and public authorities are targeted by disinformation (Koistinen et al., 2022; Turčilo & Obrenović, 2020). At the same time, attention-grabbing false information is easier to produce and spreads much faster than serious information intended to correct it (Vosoughi et al., 2018; Ireton & Posetti, 2018). ...
Chapter
Full-text available
Whether the deployment of digital technology in higher education can be regarded as an opportunity or a risk is a question of values. The relevant values range from traditional academic values and the fundamental values of higher education in the EHEA to a democratic society and respect for human rights. Before COVID-19, it was customary to focus on potential benefits for learners. However, the earlier approach failed to pay enough attention to the wider educational and societal impacts of digital transformation. This changed during the COVID-19 pandemic. Recent studies have discussed the impact of the digital transformation of higher education on staff, institutions, the higher education market and society as a whole. In this chapter, it is argued that the digital transformation of higher education can have an impact on academic freedom, student and staff participation in higher education governance, public responsibility for higher education, public responsibility of higher education, and democracy. The digital transformation of higher education makes it necessary to balance multiple values. Higher education policy in the EHEA should take into account both opportunities and risks.
... At the core of this dynamic is personalization. Digital platforms utilize machine learning algorithms and extensive datasets to predict and prioritize content that most likely resonates with individual users [19, 20]. Personalization enhances content visibility and amplifies its emotional impact by optimizing for relevance. ...
... Currently, rumors proliferate uncontrolled on various social media platforms, e.g., Twitter and Weibo. People comment on them and spread them, posing significant threats to cybersecurity and the safety of citizens' property [40]. For example, recent rumors claim that "a certain cryptocurrency will surge dramatically in the next few months," leading to excessive speculation among cryptocurrency investors. ...
Preprint
Over the past decade, social media platforms have been key in spreading rumors, leading to significant negative impacts. To counter this, the community has developed various Rumor Detection (RD) algorithms to automatically identify them using user comments as evidence. However, these RD methods often fail in the early stages of rumor propagation when only limited user comments are available, leading the community to focus on a more challenging topic named Rumor Early Detection (RED). Typically, existing RED methods learn from limited semantics in early comments. However, our preliminary experiment reveals that the RED models always perform best when the number of training and test comments is consistent and extensive. This inspires us to address the RED issue by generating more human-like comments to support this hypothesis. To implement this idea, we tune a comment generator by simulating expert collaboration and controversy and propose a new RED framework named CAMERED. Specifically, we integrate a mixture-of-expert structure into a generative language model and present a novel routing network for expert collaboration. Additionally, we synthesize a knowledgeable dataset and design an adversarial learning strategy to align the style of generated comments with real-world comments. We further integrate generated and original comments with a mutual controversy fusion module. Experimental results show that CAMERED outperforms state-of-the-art RED baseline models and generation methods, demonstrating its effectiveness.
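The mixture-of-experts structure mentioned above relies on a routing (gating) network that scores each expert and mixes the selected experts' outputs by renormalized softmax weights. The sketch below is a generic top-k router for illustration, not the paper's CAMERED implementation; the logits and scalar expert outputs are invented:

```python
import math

def softmax(xs):
    m = max(xs)                          # subtract max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def route(gate_logits, expert_outputs, top_k=2):
    """Mix the top-k experts' outputs using renormalized gate weights."""
    weights = softmax(gate_logits)
    top = sorted(range(len(weights)), key=lambda i: -weights[i])[:top_k]
    z = sum(weights[i] for i in top)     # renormalize over the selected experts
    return sum(weights[i] / z * expert_outputs[i] for i in top)

# Three hypothetical experts producing scalar "comment scores".
mixed = route([2.0, 1.0, 0.0], [1.0, 2.0, 3.0], top_k=2)
```

In a real mixture-of-experts language model the expert outputs are vectors and the router runs per token, but the top-k selection and renormalization follow the same pattern.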
... Visually simplified posts and videos, often manipulated to attract more attention, distort scientific concepts and compromise the quality of information [72]. Yet publications of this kind tend to perform better in platform metrics, creating an environment that disadvantages the complex statements of science and encourages informational precariousness [25]. ...
Article
Full-text available
Despite an apparent consensus on the strategic value of science for society and the material gains that the development of research and technologies brings to industry, culture, and power relations, there is an important debate about the influence of external political and economic values on scientific knowledge. From the conceptual perspective of agnotology, this article analyses how propaganda campaigns use the information ethics of digital platforms to produce and disseminate doubt and ignorance about the scientific knowledge of anthropogenic climate change and its spokespeople. Based on a literature review, we reinforce the warning of astrophysicists, geologists, sociologists, and many other scientists about how the idea of a "neutral science" weakens the community and feeds denialism. We examine the main strategies and discourses that combat the consensus on the climate emergency and its relation to an unsustainable economic model of exploiting the planet's resources, pointing to the responsibilities of the oil industry, agribusiness, and big tech in this scenario.
Article
Full-text available
This article explores the Sufi thought of Ahmad Khatib al-Sambasi, an influential 19th-century Nusantara Sufi, and the characteristics of his Tarekat Qadiriyah wa Naqshabandiyah (TQN). Using a qualitative approach and library research methodology, the study analyzes al-Sambasi's works, particularly Fatḥ al-‘Ārifīn, alongside secondary literature on TQN and Sufism. The research employs documentation for data collection and content analysis for interpretation. Findings reveal that al-Sambasi developed a comprehensive spiritual system grounded in tauḥīd and ma‘rifah, featuring systematic dzikr methods and spiritual practices. His teachings successfully integrate Syarī‘ah and Ḥaqīqah, emphasizing a balance between exoteric and esoteric dimensions. The TQN's distinctive features include a hierarchical structure, specific rituals like bai‘at and talqīn dzikr, and strict ethical guidelines for followers. Al-Sambasi's significant contribution lies in adapting classical Sufism to the Nusantara context, fostering widespread acceptance in the region. The study concludes that al-Sambasi's thought offers a balanced and comprehensive spirituality model, relevant both in his time and in contemporary Islamic spirituality.
Article
Full-text available
Using network theory and data analysis, we study the messages on Twitter (X) about ecological sustainability over the period 2007-2022. With a global view of 70,311,541 messages we examined the sentiment, keywords and hashtags utilised, as well as the correlations between sentiment and both socioeconomic and environmental variables. In addition to the above, we carried out an in-depth analysis of the global interactions network (retweets, replies and quotes), with a special focus on the study of the community network (CNET) (with 4576 supernodes, and 9855 links). The sentiment shown in the text of the tweets was positive over the years in all analysed locations, although close to neutral. Keyword analysis detected terms present in tweets posted from various regions, showing global thinking in the world. The relationships between sentiment and variables examined were continent- and country-specific, identifying a stronger correlation with socioeconomic attributes. Regarding CNET, according to the study performed using adjacency and laplacian embeddings, as well as Chebyshev, Euclidean, Minkowski, and Manhattan distances, pairs of unconnected supernodes appeared to have more similarity in their connection patterns than pairs of connected supernodes, due to the topological structure of CNET which has a large number of peripheral nodes that are not connected to each other, but are connected to nodes with higher centrality. In agreement with the Jaccard coefficient, resource allocation index, Adamic Adar index, and preferential attachment score, there is little possibility of link formation between supernodes. Statistically the supernodes also exhibited high topological similarity. A few specific supernodes host most of the users, showing the highest centralities among those analysed. The basic structure of CNET, which maintained its key properties, was also examined. 
Strategies that promote communication between supernodes to achieve greater participation and diversity in discussions need to be further investigated.
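The link-prediction indices named in the abstract above (Jaccard coefficient, resource allocation, Adamic-Adar, preferential attachment) all score a candidate node pair from its neighbourhoods. A minimal sketch on a toy undirected graph; the graph is invented for illustration, not the paper's supernode network:

```python
import math

# Toy undirected graph as an adjacency dict (invented example).
g = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "D"},
    "D": {"B", "C"},
}

def jaccard(g, u, v):
    # Shared neighbours relative to all neighbours of the pair.
    union = g[u] | g[v]
    return len(g[u] & g[v]) / len(union) if union else 0.0

def resource_allocation(g, u, v):
    # Each common neighbour contributes 1/degree.
    return sum(1 / len(g[w]) for w in g[u] & g[v])

def adamic_adar(g, u, v):
    # Like resource allocation, but down-weights by log-degree.
    return sum(1 / math.log(len(g[w])) for w in g[u] & g[v] if len(g[w]) > 1)

def preferential_attachment(g, u, v):
    # High-degree pairs are assumed more likely to connect.
    return len(g[u]) * len(g[v])

scores = {name: f(g, "A", "D") for name, f in [
    ("jaccard", jaccard),
    ("resource_allocation", resource_allocation),
    ("adamic_adar", adamic_adar),
    ("preferential_attachment", preferential_attachment),
]}
```

Low values of these indices across unconnected supernode pairs are what support the abstract's conclusion that there is little possibility of new link formation.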
Article
Full-text available
The announcement of LK-99 as a potential room-temperature, ambient-pressure superconductor sparked widespread debate across both traditional news outlets and social media platforms. This study investigates public perceptions and argumentation patterns surrounding LK-99 by applying sentiment analysis and computational argument mining to a diverse dataset. We analyzed 797 YouTube videos, 71,096 comments, and 1,329 news articles collected between 2023 and 2024. Our results reveal distinct sentiment trajectories: while news articles and YouTube posts exhibit fluctuating yet predominantly positive tones, user comments consistently maintain a more negative sentiment. Discourse analysis shows that structured argumentation—especially reasoning based on expert opinions, observable signs, and anticipated consequences—is prevalent in professionally curated content, whereas a significant proportion of user comments lack identifiable argumentation schemes. Moreover, channel-level analysis indicates that non-expert channels, despite their limited specialization in science, attract higher audience engagement than traditional science channels. These findings highlight the complexities of digital science communication and underscore the need for adaptive strategies that bridge the gap between expert evidence and public discourse. Our study provides practical recommendations to enhance public understanding of scientific advancements in digital spaces.
Article
In spite of the existing literature, which focuses mainly on Western contexts, this article argues that malicious activities on social media, particularly Twitter, are not limited to false information and bot activism in non-democratic societies. I argue that authoritarian regimes aim to suppress any kind of meaningful action on social media by bombarding users with a flood of true and false messages. The aim is to create what Hannah Arendt calls a "non-thinking situation." Therefore, this research hypothesizes that authoritarian regimes devise novel practices and agents to dismantle counternarratives on Twitter. In order to identify these practices and agents, this article focuses on the Women, Life, Freedom (i.e., #MahsaAmini) movement on Persian Twitter. Drawing on Arendt's ideas on storytelling and Informational Learned Helplessness (ILH) theory, I conducted a longitudinal digital ethnography combined with Social Media Critical Discourse Studies on Persian Twitter from April 2022 to February 2024. Results reveal that the Iranian regime used traditional techniques but also developed new strategies, like fabricated stories, and new actors, like undercover agents, to suppress the #MahsaAmini movement. This article empirically investigates these actors and strategies to contribute to the extant research on nefarious social media activities in contemporary societies.
Article
Aiming to contribute to the debate on whether the Internet, and in particular social networks, lead to echo chambers of fragmented groups or to a public sphere, this article investigates the dynamics of echo chambers among followers of Turkish political youth groups on Twitter. It focuses on two classes: the official youth organizations of the ruling party and the main opposition party, and one independent group. Retrieving over 40 million tweets from 30 thousand followers of these groups, 5.5 million interactions between 2016 and 2018 were analyzed. Strong echo chambers are found, and no weakening is observed, with a small-scale exception arising from cross-ideology exposure by individuals following two groups. The results are discussed in relation to political alignment and the groups' level of independence.
Article
Purpose Scholars have been interested in how in-group members perceive news about an out-group. The current study aims to explore the interaction effects of out-group news and comment congruence on news authentication, and how authentication triggers news sharing, considering social media news credibility. Design/methodology/approach The study used an experiment with a 2 (real vs fake news) × 2 (supportive vs opposing comments) factorial design to understand news authentication and follow-up behaviors. Findings The results revealed that the interaction between news type and comment congruence has positive effects on institutional and interpersonal intentional authentication. Supportive comments lead to a higher level of institutional and interpersonal intentional authentication when individuals encounter real news rather than fake news. Interpersonal authentication mediated the effects of comment congruence on news sharing. The effects of comment congruence on interpersonal authentication were stronger among individuals who perceived social media news credibility as low rather than high, controlling for news credibility. Originality/value The study expands the framework of news authentication by examining the effects of message cues on news behavior, considering the dual influence of hostile media and social credibility assessment. It advances understanding of the misinformation phenomenon and of news authentication as a pro-news behavior, and provides journalists and news agencies with insights into news practices on social media. Peer review The peer review history for this article is available at: https://publons.com/publon/10.1108/OIR-03-2024-0204 .
Article
The abundance of information on social media, partly conflicting with government information, might negatively affect citizens’ compliance with policies. Based on Dutch representative survey data from the COVID-19 pandemic, we find that citizens who ranked social media as a more important information resource were generally less compliant with COVID-19 measures and less willing to get vaccinated. A higher ranking of social media is more strongly associated with non-compliance among citizens with lower levels of institutional trust. Based on these findings, we suggest that efforts to encourage compliance should focus not only on countering misinformation, but also on enhancing institutional trust. Points for practitioners Citizens who ranked social media as a more important information resource on COVID-19 were less compliant with government COVID-19 measures and less willing to get vaccinated. This relationship was strongest among citizens with low levels of trust in the institutions of government involved in managing the pandemic. To enhance compliance with policy measures, government efforts should focus not only on countering misinformation, but also on enhancing trust in government institutions.
Article
Full-text available
The high usage of social media for spreading real-time content is evident today, encompassing users across various age groups and demographics. Its popularity stems from ease of communication, rapid propagation, accessibility, and low cost. However, social media has become a double-edged sword, spreading both real and fake content. Users exploit it to disseminate hoaxes, fake news, and malicious data for entertainment, politics, and business purposes. This paper proposes ensemble approaches that evaluate content authenticity through machine learning models. By combining Logistic Regression, Decision Tree, Gradient Boosting, Support Vector Machine, and Naïve Bayes classifiers, we achieved 92% accuracy in classifying fake and legitimate content. Individual models reached 86% accuracy.
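The ensemble approach described in this abstract can be sketched with scikit-learn's `VotingClassifier`. The synthetic dataset and hyperparameters below are illustrative stand-ins, not the paper's actual text features or configuration:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Stand-in features: in the paper these would be derived from text
# (e.g. TF-IDF); a synthetic dataset keeps the sketch self-contained.
X, y = make_classification(n_samples=400, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# Combine the five classifier families named in the abstract.
ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("dt", DecisionTreeClassifier(random_state=0)),
        ("gb", GradientBoostingClassifier(random_state=0)),
        ("svm", SVC(probability=True, random_state=0)),
        ("nb", GaussianNB()),
    ],
    voting="soft",  # average predicted class probabilities across models
)
ensemble.fit(X_tr, y_tr)
accuracy = ensemble.score(X_te, y_te)
```

Soft voting averages per-class probabilities, which typically smooths out the errors of any single weak model; this is one common way such an ensemble can beat its individual members.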
Chapter
Fake news on the internet refers to deliberately fabricated or deceptively presented information posing as legitimate news. The purpose is to negatively influence public opinion for several reasons, including political agendas, monetary profit, and propaganda. Disinformation on the internet is circulated through websites and social media. It can take various forms, including but not limited to erroneous headlines, falsified stories, distorted images, and fabricated content mirroring truthful news reports. Fake news flourishes where trust in credible media sources is low and political division is high. The propagation of false data across the internet is rooted in technology, psychology, and society. This chapter likens fake news to an infectious disease: fake news on the web is transmitted from person to person. Framing fake news as a contagious disease makes it easier to understand the variables involved, the rapid spread, the detrimental effects, and the vigorous steps necessary to combat the problem.
Chapter
This chapter explores the impact of dual narratives propagated through social media on India's public perception and political discourse from 2014 to 2024. While social media plays a key role in shaping public opinion, it also spreads misinformation and fosters polarization. The study uses a mixed-methods approach, analyzing 10 major incidents and surveying 50 respondents. The results show that respondents struggle with conflicting narratives and express mistrust in social media information. Regression analysis reveals that concerns about national unity and future divisions outweigh personal perceptions. The study emphasizes the need for stricter regulation of social media platforms to curb misinformation and polarization. Respondents overwhelmingly support regulation, indicating public demand for action. The research highlights the importance of collective efforts to create a reliable and cohesive digital environment to protect democratic processes and societal stability.
Article
Introduction: Social media are the most popular media among young people: they identify with the content and feel part of a collective. The content of the leading influencers in Spain and Chile is analyzed in order to: 1) determine what they talk about and identify whether the content is informational, educational, or entertainment; 2) determine its quality and whether it eventually fosters disinformation and tends toward trivialization; and 3) reflect on the quality of the content and how it may shape the media diet of young people. Methodology: 12 influencer accounts on Instagram, TikTok, and YouTube are analyzed through 439 pieces of content, using a comparative content analysis that combines qualitative and quantitative methods. Results: Influencers talk about a wide variety of topics but prioritize exposing their personal lives from an entertainment angle. The low quality of the content, which tends toward trivialization, is confirmed. Discussion: The novelty of this research is its focus on the quality of the content (beyond the topics discussed) and on how it may influence followers. Conclusions: Identifying this trivialization of content can help develop public policies and media-literacy training programs, and encourage the regulation and self-regulation of social media content, given its impact on the mental health of young people, who are building their identity.
Article
Mass media has historically served as a fundamental pillar for the distribution of public information, entertainment, and social unity. Research and general observation highlight that in recent years, the audience for conventional mainstream media—radio, television, and newspapers—has been consistently diminishing. Concurrently, there are serious debates regarding diminishing trust in the information provided by mainstream mass media. This study examines mass media consumption by youth using both quantitative and qualitative tools. Furthermore, perceived partiality, insufficient variety, and declining faith in conventional media have estranged younger demographics. This paper analyzes the fundamental causes of this decline, its ramifications for the Indian media landscape, and solutions for conventional media to maintain relevance in a digitally dominated environment. Key words: Radio Listeners, Television Viewers, Newspaper Readers, Youth, Trust in Mass Media as a Source of Information, Social Media
Preprint
Full-text available
The widespread emergence of manipulated news media content poses significant challenges to online information integrity. This study investigates whether dialogues with AI about AI-generated images and associated news statements can increase human discernment abilities and foster short-term learning in detecting misinformation. We conducted a study with 80 participants who engaged in structured dialogues with an AI system about news headline-image pairs, generating 1,310 human-AI dialogue exchanges. Results show that AI interaction significantly boosts participants' accuracy in identifying real versus fake news content from approximately 60% to 90% (p < 0.001). However, these improvements do not persist when participants are presented with new, unseen image-statement pairs without AI assistance, with accuracy returning to baseline levels (≈60%, p = 0.88). These findings suggest that while AI systems can effectively change immediate beliefs about specific content through persuasive dialogue, they may not produce lasting improvements that transfer to novel examples, highlighting the need for developing more effective interventions that promote durable learning outcomes.
Article
The proliferation of erroneous information on social media has a deleterious effect on both people and society. In order to mitigate the drawbacks of social media, it is crucial to distinguish between authentic and misleading information. The proposed research presents a novel method to tackle the issue of identifying misinformation on social media. The main objective is to reframe the false news detection issue as an optimization problem and use two specialized metaheuristic algorithms, salp swarm optimization and grey wolf optimization, to address it. The proposed detection method is a three-step model: pre-processing the data is the fundamental step; the second step involves modifying grey wolf optimization and salp swarm optimization, thereby creating a new false news detection model; and the final step involves testing the proposed model. Three separate real-world datasets have been used for training the proposed false news detection model, conducting the data analysis, performing the statistical tests, benchmarking the proposed algorithms, and generating insights through reporting and visualization. The findings demonstrate that, among the artificial intelligence algorithms tested, the grey wolf optimization algorithm (accuracy = 0.97, precision = 0.97, recall = 1.0, F-score = 0.98) outperforms salp swarm optimization in addressing various social media issues.
Article
Full-text available
The post-truth era poses significant obstacles to individuals' ability to think freely and autonomously in the public sphere and to make informed decisions. In this article the author draws on the 18th-century German philosopher Immanuel Kant's idea of the public use of reason, emphasizing that retreating from Enlightenment values endangers the future of rational public discourse and democracy. In contemporary reality, the Kantian demand, or ideal, of the public use of reason faces several significant challenges tied to stepping back from, or at least deviating from, the Enlightenment's shared vision. This article addresses the most pressing of these: misinformation and disinformation, social media algorithms and echo chambers, emotional manipulation, and populism. Each of these factors in turn undermines the conditions individuals need in order to exercise their reason freely and independently in the public sphere. Nevertheless, through rethinking the aims of education, restoring trust in traditional institutions, adopting more deliberate strategies around digital technologies, and pursuing a "grand" politics of reconstructing public truths, it is still possible to create spaces for free discussion and democratic engagement, fostering free and autonomous thinking in today's world.
Article
Full-text available
The World Economic Forum listed massive digital misinformation as one of the main threats for our society. The spreading of unsubstantiated rumors may have serious consequences on public opinion such as in the case of rumors about Ebola causing disruption to health-care workers. In this work we target Facebook to characterize information consumption patterns of 1.2 M Italian users with respect to verified (science news) and unverified (conspiracy news) contents. Through a thorough quantitative analysis we provide important insights about the anatomy of the system across which misinformation might spread. In particular, we show that users’ engagement on verified (or unverified) content correlates with the number of friends having similar consumption patterns (homophily). Finally, we measure how this social system responded to the injection of 4,709 false information. We find that the frequent (and selective) exposure to specific kind of content (polarization) is a good proxy for the detection of homophile clusters where certain kind of rumors are more likely to spread.
Conference Paper
Full-text available
While most online social media accounts are controlled by humans, these platforms also host automated agents called social bots or sybil accounts. Recent literature reported on cases of social bots imitating humans to manipulate discussions, alter the popularity of users, pollute content and spread misinformation, and even perform terrorist propaganda and recruitment actions. Here we present BotOrNot, a publicly-available service that leverages more than one thousand features to evaluate the extent to which a Twitter account exhibits similarity to the known characteristics of social bots. Since its release in May 2014, BotOrNot has served over one million requests via our website and APIs.
Article
Full-text available
Spam in online social networks (OSNs) is a systemic problem that imposes a threat to these services in terms of undermining their value to advertisers and potential investors, as well as negatively affecting users’ engagement. As spammers continuously keep creating newer accounts and evasive techniques upon being caught, a deeper understanding of their spamming strategies is vital to the design of future social media defense mechanisms. In this work, we present a unique analysis of spam accounts in OSNs viewed through the lens of their behavioral characteristics. Our analysis includes over 100 million messages collected from Twitter over the course of 1 month. We show that there exist two behaviorally distinct categories of spammers and that they employ different spamming strategies. Then, we illustrate how users in these two categories demonstrate different individual properties as well as social interaction patterns. Finally, we analyze the detectability of spam accounts with respect to three categories of features, namely content attributes, social interactions, and profile properties.
Article
Full-text available
Significance The wide availability of user-provided content in online social media facilitates the aggregation of people around common interests, worldviews, and narratives. However, the World Wide Web is a fruitful environment for the massive diffusion of unverified rumors. In this work, using a massive quantitative analysis of Facebook, we show that information related to distinct narratives––conspiracy theories and scientific news––generates homogeneous and polarized communities (i.e., echo chambers) having similar information consumption patterns. Then, we derive a data-driven percolation model of rumor spreading that demonstrates that homogeneity and polarization are the main determinants for predicting cascades’ size.
Conference Paper
Full-text available
The goal of this work is to introduce a simple modeling framework to study the diffusion of hoaxes and in particular how the availability of debunking information may contain their diffusion. As traditionally done in the mathematical modeling of information diffusion processes, we regard hoaxes as viruses: users can become infected if they are exposed to them, and turn into spreaders as a consequence. Upon verification, users can also turn into non-believers and spread the same attitude with a mechanism analogous to that of the hoax-spreaders. Both believers and non-believers, as time passes, can return to a susceptible state. Our model is characterized by four parameters: spreading rate, gullibility, probability to verify a hoax, and that to forget one's current belief. Simulations on homogeneous, heterogeneous, and real networks for a wide range of parameter values reveal a threshold for the fact-checking probability that guarantees the complete removal of the hoax from the network. Via a mean field approximation, we establish that the threshold value does not depend on the spreading rate but only on the gullibility and forgetting probability. Our approach allows us to quantitatively gauge the minimal reaction necessary to eradicate a hoax.
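The four-parameter dynamics can be illustrated with a mean-field sketch in Python. The update rules below are a simplified reading of the model (compartments for susceptible users, hoax believers, and fact-checkers), with assumed parameter values; they are not the authors' exact equations:

```python
def simulate_hoax(beta=0.5, alpha=0.3, p_verify=0.2, p_forget=0.1, steps=500):
    """Mean-field sketch: beta = spreading rate, alpha = gullibility,
    p_verify = probability to fact-check, p_forget = probability to forget."""
    S, B, F = 0.99, 0.01, 0.0  # susceptible, believer, fact-checker fractions
    for _ in range(steps):
        exposed = beta * S * (B + F)   # susceptibles reached by either camp
        new_B = alpha * exposed        # believe with probability ~ gullibility
        new_F = (1 - alpha) * exposed  # otherwise adopt the debunking
        verified = p_verify * B        # believers who fact-check and convert
        forget_B, forget_F = p_forget * B, p_forget * F
        S += forget_B + forget_F - exposed
        B += new_B - verified - forget_B
        F += new_F + verified - forget_F
    return S, B, F

low_check = simulate_hoax(p_verify=0.05)
high_check = simulate_hoax(p_verify=0.90)
```

Consistent with the threshold result reported in the abstract, raising the fact-checking probability drives the believer fraction down, while the total population fraction stays conserved.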
Article
Full-text available
Traditional fact checking by expert journalists cannot keep up with the enormous volume of information that is now generated online. Computational fact checking may significantly enhance our ability to evaluate the veracity of dubious information. Here we show that the complexities of human fact checking can be approximated quite well by finding the shortest path between concept nodes under properly defined semantic proximity metrics on knowledge graphs. Framed as a network problem this approach is feasible with efficient computational techniques. We evaluate this approach by examining tens of thousands of claims related to history, entertainment, geography, and biographical information using a public knowledge graph extracted from Wikipedia. Statements independently known to be true consistently receive higher support via our method than do false ones. These findings represent a significant step toward scalable computational fact-checking methods that may one day mitigate the spread of harmful misinformation.
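The shortest-path idea can be illustrated on a toy knowledge graph. The entities and edges below are a hypothetical miniature, and the score is plain inverse path length rather than the degree-weighted semantic proximity the paper defines:

```python
from collections import deque

# Hypothetical miniature knowledge graph as an undirected adjacency list.
GRAPH = {
    "Barack Obama": {"Honolulu", "United States", "Democratic Party"},
    "Honolulu": {"Barack Obama", "Hawaii"},
    "Hawaii": {"Honolulu", "United States"},
    "United States": {"Barack Obama", "Hawaii", "Canada"},
    "Canada": {"United States", "Ottawa"},
    "Ottawa": {"Canada"},
    "Democratic Party": {"Barack Obama"},
}

def shortest_path_len(src, dst):
    """Breadth-first search for the hop distance between two concepts."""
    if src == dst:
        return 0
    seen, frontier = {src}, deque([(src, 0)])
    while frontier:
        node, dist = frontier.popleft()
        for nxt in GRAPH.get(node, ()):
            if nxt == dst:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, dist + 1))
    return None  # concepts are disconnected

def support(subject, obj):
    """Crude truth score: closer concepts receive higher support."""
    d = shortest_path_len(subject, obj)
    if d is None:
        return 0.0
    return 1.0 if d == 0 else 1.0 / d
```

On this toy graph the true-style claim "Barack Obama – Honolulu" scores 1.0, while the false-style claim "Barack Obama – Ottawa" scores only 1/3, mirroring the paper's finding that true statements receive higher support than false ones.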
Article
Full-text available
The Boston Marathon bombing story unfolded on every possible carrier of information available in the spring of 2013, including Twitter. As information spread, it was filled with rumors (unsubstantiated information), and many of these rumors contained misinformation. Earlier studies have suggested that crowdsourced information flows can correct misinformation, and our research investigates this proposition. This exploratory research examines three rumors, later demonstrated to be false, that circulated on Twitter in the aftermath of the bombings. Our findings suggest that corrections to the misinformation emerge but are muted compared with the propagation of the misinformation. The similarities and differences we observe in the patterns of the misinformation and corrections contained within the stream over the days that followed the attacks suggest directions for possible research strategies to automatically detect misinformation.
Article
Full-text available
The large availability of user-provided content on online social media facilitates people's aggregation around common interests, worldviews, and narratives. However, in spite of the enthusiastic rhetoric about the so-called wisdom of crowds, unsubstantiated rumors, offered as alternative explanations to mainstream accounts of complex phenomena, find on the Web a natural medium for their dissemination. In this work we study, on a sample of 1.2 million individuals, how information related to very distinct narratives, i.e., mainstream scientific and alternative news, is consumed on Facebook. Through a thorough quantitative analysis, we show that distinct communities with similar information consumption patterns emerge around distinctive narratives. Moreover, consumers of alternative news (mainly conspiracy theories) turn out to be more focused on their contents, while scientific news consumers are more prone to comment on alternative news. We conclude our analysis by testing the response of this social system to 4,709 troll posts, i.e., parodistic imitations of alternative and conspiracy theories. We find that, despite the false and satirical vein of these posts, regular consumers of conspiracy news are the most prone to interact with them.
Article
Full-text available
The Turing test asked whether one could recognize the behavior of a human from that of a computer algorithm. Today this question has suddenly become very relevant in the context of social media, where text constraints limit the expressive power of humans, and real incentives abound to develop human-mimicking software agents called social bots. These elusive entities wildly populate social media ecosystems, often going unnoticed among the population of real people. Bots can be benign or harmful, aiming at persuading, smearing, or deceiving. Here we discuss the characteristics of modern, sophisticated social bots, and how their presence can endanger online ecosystems and our society. We then discuss current efforts aimed at detection of social bots in Twitter. Characteristics related to content, network, sentiment, and temporal patterns of activity are imitated by bots but at the same time can help discriminate synthetic behaviors from human ones, yielding signatures of engineered social tampering.
Conference Paper
Full-text available
In today's world, online social media plays a vital role during real-world events, especially crisis events. Social media coverage of events has both positive and negative effects: it can be used by authorities for effective disaster management, or by malicious entities to spread rumors and fake news. The aim of this paper is to highlight the role of Twitter during Hurricane Sandy (2012) in spreading fake images about the disaster. We identified 10,350 unique tweets containing fake images that circulated on Twitter during Hurricane Sandy. We performed a characterization analysis to understand the temporal, social reputation, and influence patterns for the spread of fake images. Eighty-six percent of tweets spreading the fake images were retweets; hence, very few were original tweets. Our results showed that the top thirty users out of 10,215 (0.3%) accounted for 90% of the retweets of fake images; also, network links such as Twitter follower relationships contributed very little (only 11%) to the spread of these fake photo URLs. Next, we used classification models to distinguish fake images from real images of Hurricane Sandy. The best results were obtained from a Decision Tree classifier, which achieved 97% accuracy in predicting fake images from real ones. Tweet-based features were very effective in distinguishing fake image tweets from real ones, while the performance of user-based features was very poor. Our results showed that automated techniques can be used to identify real images from fake images posted on Twitter.
Conference Paper
Full-text available
Characterizing information diffusion on social platforms like Twitter enables us to understand the properties of the underlying medium and model communication patterns. As Twitter gains in popularity, it has also become a venue to broadcast rumors and misinformation. We use epidemiological models to characterize information cascades on Twitter resulting from both news and rumors. Specifically, we use the SEIZ enhanced epidemic model, which explicitly recognizes skeptics, to characterize eight events across the world spanning a range of event types. We demonstrate that our approach is accurate at capturing diffusion in these events. Our approach can be fruitfully combined with other strategies that use content modeling and graph-theoretic features to detect (and possibly disrupt) rumors.
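A minimal Euler-integration sketch of the SEIZ compartments (Susceptible, Exposed, Infected/spreaders, skeptics Z) is below. The rate equations follow the standard SEIZ formulation, and the parameter values are arbitrary illustrations, not the values fitted in the paper:

```python
def seiz_step(S, E, I, Z, N, beta, b, p, l, rho, eps, dt):
    """One Euler step of the SEIZ rumor model.
    beta: S-I contact rate; b: S-Z contact rate; p: prob. a contacted
    susceptible spreads immediately; l: prob. of turning skeptic immediately;
    rho: E-I contact rate; eps: spontaneous E -> I incubation rate."""
    dS = -beta * S * I / N - b * S * Z / N
    dE = ((1 - p) * beta * S * I / N + (1 - l) * b * S * Z / N
          - rho * E * I / N - eps * E)
    dI = p * beta * S * I / N + rho * E * I / N + eps * E
    dZ = l * b * S * Z / N
    return S + dS * dt, E + dE * dt, I + dI * dt, Z + dZ * dt

# Illustrative run: small seeds of spreaders and skeptics in a
# population of 10,000, integrated with a fixed step size.
N = 10_000.0
S, E, I, Z = N - 20, 0.0, 10.0, 10.0
for _ in range(5_000):
    S, E, I, Z = seiz_step(S, E, I, Z, N,
                           beta=0.3, b=0.1, p=0.4, l=0.3,
                           rho=0.2, eps=0.1, dt=0.01)
```

Because every outflow term reappears as an inflow elsewhere, the total population S + E + I + Z is conserved at each step, which is a quick sanity check on any implementation of the model.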
Conference Paper
Full-text available
Twitter is useful in a disaster situation for communication, announcements, requests for rescue, and so on. On the other hand, it produces a negative by-product: the spreading of rumors. This paper describes how rumors spread after an earthquake disaster and discusses how we can deal with them. We first investigated actual instances of rumors after the disaster and then attempted to identify the characteristics of those rumors. Based on this investigation, we developed a system that detects rumor candidates on Twitter, and we evaluated it. The experimental results show that the proposed algorithm can find rumors with acceptable accuracy.
Article
Full-text available
Detecting emotions in microblogs and social media posts has applications for industry, health, and security. Statistical, supervised automatic methods for emotion detection rely on text that is labeled for emotions, but such data are rare and available for only a handful of basic emotions. In this article, we show that emotion‐word hashtags are good manual labels of emotions in tweets. We also propose a method to generate a large lexicon of word–emotion associations from this emotion‐labeled tweet corpus. This is the first lexicon with real‐valued word–emotion association scores. We begin with experiments for six basic emotions and show that the hashtag annotations are consistent and match with the annotations of trained judges. We also show how the extracted tweet corpus and word–emotion associations can be used to improve emotion classification accuracy in a different nontweet domain. Eminent psychologist Robert Plutchik had proposed that emotions have a relationship with personality traits. However, empirical experiments to establish this relationship have been stymied by the lack of comprehensive emotion resources. Because personality may be associated with any of the hundreds of emotions and because our hashtag approach scales easily to a large number of emotions, we extend our corpus by collecting tweets with hashtags pertaining to 585 fine emotions. Then, for the first time, we present experiments to show that fine emotion categories such as those of excitement, guilt, yearning, and admiration are useful in automatically detecting personality from text. Stream‐of‐consciousness essays and collections of Facebook posts marked with personality traits of the author are used as test sets.
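The hashtag-as-label idea can be sketched as a PMI-style word-emotion association score. The four mini "tweets" below are invented examples (with the emotion hashtag already stripped off and kept as the label), and the scoring is generic pointwise mutual information rather than the authors' exact procedure:

```python
import math
from collections import Counter, defaultdict

# Invented hashtag-labeled examples: text with the #joy / #anger
# hashtag removed and retained as the emotion label.
TWEETS = [
    ("won the match today", "joy"),
    ("sunny day at the beach", "joy"),
    ("missed my flight again", "anger"),
    ("traffic made me late again", "anger"),
]

def build_lexicon(tweets):
    """Score each (word, emotion) pair by PMI over token counts."""
    word_emotion = defaultdict(Counter)
    emotion_total, word_total = Counter(), Counter()
    n = 0
    for text, emo in tweets:
        for w in text.split():
            word_emotion[w][emo] += 1
            emotion_total[emo] += 1
            word_total[w] += 1
            n += 1
    scores = {}
    for w, emos in word_emotion.items():
        for emo, c in emos.items():
            # PMI: log of observed vs. expected co-occurrence frequency
            scores[(w, emo)] = math.log(
                (c * n) / (word_total[w] * emotion_total[emo]))
    return scores

lexicon = build_lexicon(TWEETS)
```

Words that co-occur only with one emotion label ("again" with anger, "sunny" with joy in this toy corpus) get positive association scores, which is the real-valued word-emotion association the abstract describes, here computed at miniature scale.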
Article
Full-text available
The announcement of the discovery of a Higgs boson-like particle at CERN will be remembered as one of the milestones of the scientific endeavor of the 21st century. In this paper we present a study of information spreading processes on Twitter before, during and after the announcement of the discovery of a new particle with the features of the elusive Higgs boson on 4th July 2012. We report evidence for non-trivial spatio-temporal patterns in user activities at individual and global level, such as tweeting, re-tweeting and replying to existing tweets. We provide a possible explanation for the observed time-varying dynamics of user activities during the spreading of this scientific "rumor". We model the information spreading in the corresponding network of individuals who posted a tweet related to the Higgs boson discovery. Finally, we show that we are able to reproduce the global behavior of about 500,000 individuals with remarkable accuracy.
Article
Full-text available
Even though considerable attention has been given to the polarity of words (positive and negative) and the creation of large polarity lexicons, research in emotion analysis has had to rely on limited and small emotion lexicons. In this paper we show how the combined strength and wisdom of the crowds can be used to generate a large, high-quality, word-emotion and word-polarity association lexicon quickly and inexpensively. We enumerate the challenges in emotion annotation in a crowdsourcing scenario and propose solutions to address them. Most notably, in addition to questions about emotions associated with terms, we show how the inclusion of a word choice question can discourage malicious data entry, help identify instances where the annotator may not be familiar with the target term (allowing us to reject such annotations), and help obtain annotations at sense level (rather than at word level). We conducted experiments on how to formulate the emotion-annotation questions, and show that asking if a term is associated with an emotion leads to markedly higher inter-annotator agreement than that obtained by asking if a term evokes an emotion.
Conference Paper
Full-text available
Models of networked diffusion that are motivated by analogy with the spread of infectious disease have been applied to a wide range of social and economic adoption processes, including those related to new products, ideas, norms and behaviors. However, it is unknown how accurately these models account for the empirical structure of diffusion over networks. Here we describe the diffusion patterns arising from seven online domains, ranging from communications platforms to networked games to microblogging services, each involving distinct types of content and modes of sharing. We find strikingly similar patterns across all domains. In particular, the vast majority of cascades are small, and are described by a handful of simple tree structures that terminate within one degree of an initial adopting "seed." In addition we find that structures other than these account for only a tiny fraction of total adoptions; that is, adoptions resulting from chains of referrals are extremely rare. Finally, even for the largest cascades that we observe, we find that the bulk of adoptions often takes place within one degree of a few dominant individuals. Together, these observations suggest new directions for modeling of online adoption processes.
Conference Paper
Full-text available
In this article we explore the behavior of Twitter users in an emergency situation. In particular, we analyze the activity related to the 2010 earthquake in Chile and characterize Twitter in the hours and days following this disaster. Furthermore, we perform a preliminary study of certain social phenomena, such as the dissemination of false rumors and confirmed news. We analyze how this information propagated through the Twitter network, with the purpose of assessing the reliability of Twitter as an information source under extreme circumstances. Our analysis shows that the propagation of tweets that correspond to rumors differs from that of tweets that spread news, because rumors tend to be questioned more than news by the Twitter community. This result shows that it is possible to detect rumors by using aggregate analysis of tweets.
Conference Paper
Full-text available
Even though considerable attention has been given to semantic orientation of words and the creation of large polarity lexicons, research in emotion analysis has had to rely on limited and small emotion lexicons. In this paper, we show how we create a high-quality, moderate-sized emotion lexicon using Mechanical Turk. In addition to questions about emotions evoked by terms, we show how the inclusion of a word choice question can discourage malicious data entry, help identify instances where the annotator may not be familiar with the target term (allowing us to reject such annotations), and help obtain annotations at sense level (rather than at word level). We perform an extensive analysis of the annotations to better understand the distribution of emotions evoked by terms of different parts of speech. We identify which emotions tend to be evoked simultaneously by the same term and show that certain emotions indeed go hand in hand.
Conference Paper
Full-text available
Due to its rapid speed of information spread, wide user base, and extreme mobility, Twitter is drawing attention as a potential emergency reporting tool under extreme events. At the same time, however, Twitter is sometimes dismissed as a citizen-based, non-professional social medium for propagating misinformation, rumors, and, in extreme cases, propaganda. This study explores the working dynamics of the rumor mill by analyzing Twitter data from the 2010 Haiti earthquake. For this analysis, two key variables, anxiety and informational uncertainty, are derived from rumor theory, and their interactive dynamics are measured by both quantitative and qualitative methods. Our research finds that information with credible sources contributes to suppressing the level of anxiety in the Twitter community, which leads to rumor control and high information quality.
Conference Paper
Full-text available
Twitter as a new form of social media can potentially contain much useful information, but content analysis on Twitter has not been well studied. In particular, it is not clear whether, as an information source, Twitter can simply be regarded as a faster news feed that covers mostly the same information as traditional news media. In this paper we empirically compare the content of Twitter with a traditional news medium, the New York Times, using unsupervised topic modeling. We use a Twitter-LDA model to discover topics from a representative sample of the entire Twitter stream. We then use text mining techniques to compare these Twitter topics with topics from the New York Times, taking into consideration topic categories and types. We also study the relation between the proportions of opinionated tweets and retweets and topic categories and types. Our comparisons show interesting and useful findings for downstream IR and DM applications.
Conference Paper
Full-text available
We analyze the information credibility of news propagated through Twitter, a popular microblogging service. Previous research has shown that most of the messages posted on Twitter are truthful, but the service is also used to spread misinformation and false rumors, often unintentionally. In this paper we focus on automatic methods for assessing the credibility of a given set of tweets. Specifically, we analyze microblog postings related to "trending" topics and classify them as credible or not credible, based on features extracted from them. We use features from users' posting and re-posting ("re-tweeting") behavior, from the text of the posts, and from citations to external sources. We evaluate our methods using a significant number of human assessments of the credibility of items in a recent sample of Twitter postings. Our results show that there are measurable differences in the way messages propagate, which can be used to classify them automatically as credible or not credible, with precision and recall in the range of 70% to 80%.
Article
Full-text available
The statistic kappa was introduced to measure nominal scale agreement between a fixed pair of raters. Kappa was generalized to the case where each of a sample of 30 patients was rated on a nominal scale by the same number of psychiatrist raters (n = 6), but where the raters rating one patient were not necessarily the same as those rating another. Large-sample standard errors were derived.
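The generalized kappa described above can be computed directly from a subjects-by-categories count matrix: agreement per subject is compared against the agreement expected by chance from the marginal category proportions. A minimal sketch (our illustration, not the paper's code):

```python
def fleiss_kappa(ratings):
    """Fleiss' kappa for N subjects, each rated by n raters into k categories.

    `ratings` is an N x k matrix: ratings[i][j] is the number of raters who
    assigned subject i to category j (n assumed constant across subjects).
    """
    N = len(ratings)
    n = sum(ratings[0])  # raters per subject
    k = len(ratings[0])
    # p_j: overall proportion of assignments falling in category j
    p = [sum(row[j] for row in ratings) / (N * n) for j in range(k)]
    # P_i: extent of agreement among the n raters on subject i
    P = [(sum(c * c for c in row) - n) / (n * (n - 1)) for row in ratings]
    P_bar = sum(P) / N                  # mean observed agreement
    P_e = sum(pj * pj for pj in p)      # agreement expected by chance
    return (P_bar - P_e) / (1 - P_e)
```

With perfect agreement the statistic is 1; values near 0 indicate chance-level agreement.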
Article
The spread of malicious or accidental misinformation in social media, especially in time-sensitive situations, such as real-world emergencies, can have harmful effects on individuals and society. In this work, we developed models for automated verification of rumors (unverified information) that propagate through Twitter. To predict the veracity of rumors, we identified salient features of rumors by examining three aspects of information spread: linguistic style used to express rumors, characteristics of people involved in propagating information, and network propagation dynamics. The predicted veracity of a time series of these features extracted from a rumor (a collection of tweets) is generated using Hidden Markov Models. The verification algorithm was trained and tested on 209 rumors representing 938,806 tweets collected from real-world events, including the 2013 Boston Marathon bombings, the 2014 Ferguson unrest, and the 2014 Ebola epidemic, and many other rumors about various real-world events reported on popular websites that document public rumors. The algorithm was able to correctly predict the veracity of 75% of the rumors faster than any other public source, including journalists and law enforcement officials. The ability to track rumors and predict their outcomes may have practical applications for news consumers, financial markets, journalists, and emergency services, and more generally to help minimize the impact of false information on Twitter.
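The paper scores a time series of rumor features with Hidden Markov Models; the core likelihood computation behind such scoring is the standard HMM forward algorithm, with veracity predicted by comparing likelihoods under models trained on true versus false rumors. A minimal sketch with hypothetical toy parameters (the actual models and features in the paper are far richer):

```python
def forward_likelihood(obs, start, trans, emit):
    """Forward algorithm: P(observation sequence | HMM).

    start[s] is the initial probability of state s, trans[s][t] the
    transition probability s -> t, and emit[s][o] the probability that
    state s emits observation o. Illustrative sketch only.
    """
    # Initialize with the first observation.
    alpha = {s: start[s] * emit[s][obs[0]] for s in start}
    # Recurse over the remaining observations.
    for o in obs[1:]:
        alpha = {t: sum(alpha[s] * trans[s][t] for s in alpha) * emit[t][o]
                 for t in trans}
    return sum(alpha.values())
```

A rumor would be labeled with whichever class's trained HMM assigns its feature sequence the higher likelihood.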
Conference Paper
Cascades of information-sharing are a primary mechanism by which content reaches its audience on social media, and an active line of research has studied how such cascades, which form as content is reshared from person to person, develop and subside. In this paper, we perform a large-scale analysis of cascades on Facebook over significantly longer time scales, and find that a more complex picture emerges, in which many large cascades recur, exhibiting multiple bursts of popularity with periods of quiescence in between. We characterize recurrence by measuring the time elapsed between bursts, their overlap and proximity in the social network, and the diversity in the demographics of individuals participating in each peak. We discover that content virality, as revealed by its initial popularity, is a main driver of recurrence, with the availability of multiple copies of that content helping to spark new bursts. Still, beyond a certain popularity of content, the rate of recurrence drops as cascades start exhausting the population of interested individuals. We reproduce these observed patterns in a simple model of content recurrence simulated on a real social network. Using only characteristics of a cascade's initial burst, we demonstrate strong performance in predicting whether it will recur in the future.
Conference Paper
We present Tweet2Vec, a novel method for generating general-purpose vector representations of tweets. The model learns tweet embeddings using a character-level CNN-LSTM encoder-decoder. We trained our model on 3 million randomly selected English-language tweets. The model was evaluated using two methods: tweet semantic similarity and tweet sentiment categorization, outperforming the previous state of the art in both tasks. The evaluations demonstrate the power of the tweet embeddings generated by our model for various tweet categorization tasks. The vector representations generated by our model are generic, and hence can be applied to a variety of tasks. Though the model presented in this paper is trained on English-language tweets, the method presented can be used to learn tweet embeddings for other languages.
Conference Paper
Many previous techniques identify trending topics in social media, even topics that are not pre-defined. We present a technique to identify trending rumors, which we define as topics that include disputed factual claims. Putting aside any attempt to assess whether the rumors are true or false, it is valuable to identify trending rumors as early as possible. It is extremely difficult to accurately classify whether every individual post is or is not making a disputed factual claim. We are able to identify trending rumors by recasting the problem as finding entire clusters of posts whose topic is a disputed factual claim. The key insight is that when there is a rumor, even though most posts do not raise questions about it, there may be a few that do. If we can find signature text phrases that are used by a few people to express skepticism about factual claims and are rarely used to express anything else, we can use those as detectors for rumor clusters. Indeed, we have found a few phrases that seem to be used exactly that way, including: "Is this true?", "Really?", and "What?". Relatively few posts related to any particular rumor use any of these enquiry phrases, but lots of rumor diffusion processes have some posts that do and have them quite early in the diffusion. We have developed a technique based on searching for the enquiry phrases, clustering similar posts together, and then collecting related posts that do not contain these simple phrases. We then rank the clusters by their likelihood of really containing a disputed factual claim. The detector, which searches for the very rare but very informative phrases, combined with clustering and a classifier on the clusters, yields surprisingly good performance. On a typical day of Twitter, about a third of the top 50 clusters were judged to be rumors, a high enough precision that human analysts might be willing to sift through them.
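The detector's first stage — searching posts for the rare skepticism-signalling phrases quoted above — can be sketched as a simple filter (a toy illustration, not the authors' pipeline, which adds clustering of similar posts and a cluster-level ranking classifier):

```python
# Enquiry phrases reported in the paper as skepticism signals.
ENQUIRY_PHRASES = ("is this true", "really?", "what?")

def flag_enquiry_posts(posts):
    """Return posts containing a skepticism-signalling enquiry phrase.

    Only a toy first stage: the full technique clusters posts around each
    flagged one and ranks clusters by likelihood of containing a disputed
    factual claim.
    """
    return [p for p in posts
            if any(phrase in p.lower() for phrase in ENQUIRY_PHRASES)]
```

Because only a few posts in a rumor's diffusion use such phrases, the flagged posts act as seeds for collecting the much larger set of related posts that do not.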
Article
Twitter has become one of the main sources of news for many people. As real-world events and emergencies unfold, Twitter is abuzz with hundreds of thousands of stories about the events. Some of these stories are harmless, while others could potentially be life-saving or sources of malicious rumors. Thus, it is critically important to be able to efficiently track stories that spread on Twitter during these events. In this paper, we present a novel semi-automatic tool that enables users to efficiently identify and track stories about real-world events on Twitter. We ran a user study with 25 participants, demonstrating that compared to more conventional methods, our tool can increase the speed and the accuracy with which users can track stories about real-world events.
Article
Online social networks provide a rich substrate for rumor propagation. Information received via friends tends to be trusted, and online social networks allow individuals to transmit information to many friends at once. By referencing known rumors from Snopes.com, a popular website documenting memes and urban legends, we track the propagation of thousands of rumors appearing on Facebook. From this sample we infer the rates at which rumors from different categories and of varying truth value are uploaded and reshared. We find that rumor cascades run deeper in the social network than reshare cascades in general. We then examine the effect of individual reshares receiving a comment containing a link to a Snopes article on the evolution of the cascade. We find that receiving such a comment increases the likelihood that a reshare of a rumor will be deleted. Furthermore, large cascades are able to accumulate hundreds of Snopes comments while continuing to propagate. Finally, using a dataset of rumors copied and pasted from one status update to another, we show that rumors change over time and that different variants tend to dominate different bursts in popularity.
Article
Though Twitter acts as a real-time news source, with people acting as sensors and sending event updates from all over the world, rumors spread via Twitter have been noted to cause considerable damage. Given a set of popular Twitter events along with related users and tweets, we study the problem of automatically assessing the credibility of such events. We propose a credibility analysis approach enhanced with event graph-based optimization to solve the problem. First we experiment by performing PageRank-like credibility propagation on a multi-typed network consisting of events, tweets, and users. Further, within each iteration, we enhance the basic trust analysis by updating event credibility scores using regularization on a new graph of events. Our experiments using events extracted from two tweet feed datasets, each with millions of tweets, show that our event graph optimization approach outperforms the basic credibility analysis approach. Also, our methods are significantly more accurate (∼86%) than the decision tree classifier approach (∼72%).
Article
Twitter and other social media are now a major method of information exchange and dissemination. Although they can support rapid communication and sharing of useful information, they can also facilitate the spread of rumors, which contain unverified information. The purpose of the work reported here was to examine several design ideas for reducing the spread of health-related rumors in a Twitter-like environment. The results have shown that exposing people to information that refutes rumors or warns that the statement has appeared on rumor websites could reduce the spread of rumors. These results suggest that social media technologies can be designed such that users can self correct and inactivate potentially inaccurate information in their environment.
Conference Paper
Social media have become an established feature of the dynamic information space that emerges during crisis events. Both emergency responders and the public use these platforms to search for, disseminate, challenge, and make sense of information during crises. In these situations rumors also proliferate, but just how fast such information can spread is an open question. We address this gap, modeling the speed of information transmission to compare retransmission times across content and context features. We specifically contrast rumor-affirming messages with rumor-correcting messages on Twitter during a notable hostage crisis to reveal differences in transmission speed. Our work has important implications for the growing field of crisis informatics.
Article
Viral products and ideas are intuitively understood to grow through a person-to-person diffusion process analogous to the spread of an infectious disease; however, until recently it has been prohibitively difficult to directly observe purportedly viral events, and thus to rigorously quantify or characterize their structural properties. Here we propose a formal measure of what we label “structural virality” that interpolates between two conceptual extremes: content that gains its popularity through a single, large broadcast and that which grows through multiple generations with any one individual directly responsible for only a fraction of the total adoption. We use this notion of structural virality to analyze a unique data set of a billion diffusion events on Twitter, including the propagation of news stories, videos, images, and petitions. We find that across all domains and all sizes of events, online diffusion is characterized by surprising structural diversity; that is, popular events regularly grow via both broadcast and viral mechanisms, as well as essentially all conceivable combinations of the two. Nevertheless, we find that structural virality is typically low, and remains so independent of size, suggesting that popularity is largely driven by the size of the largest broadcast. Finally, we attempt to replicate these findings with a model of contagion characterized by a low infection rate spreading on a scale-free network. We find that although several of our empirical findings are consistent with such a model, it fails to replicate the observed diversity of structural virality, thereby suggesting new directions for future modeling efforts.
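Structural virality, as defined in this paper, is the average shortest-path distance over all ordered pairs of nodes in the diffusion tree (the tree's Wiener index divided by n(n−1)): a pure broadcast star scores near its minimum, while a long person-to-person chain scores high. A small sketch on an edge-list tree (our illustration, not the authors' code):

```python
from collections import deque

def structural_virality(edges):
    """Average pairwise shortest-path distance in a diffusion tree.

    `edges` lists (u, v) pairs of a connected tree; distances are found
    by breadth-first search from every node.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    total = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            x = queue.popleft()
            for y in adj[x]:
                if y not in dist:
                    dist[y] = dist[x] + 1
                    queue.append(y)
        total += sum(dist.values())
    n = len(adj)
    return total / (n * (n - 1))
```

For a star with three leaves the measure is 1.5, whereas a three-node chain already scores 4/3 and grows with chain length, capturing the broadcast-versus-viral distinction.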
Article
We consider statistical inference for regression when data are grouped into clusters, with regression model errors independent across clusters but correlated within clusters. Examples include data on individuals with clustering on village, region, or another category such as industry, and state-year differences-in-differences studies with clustering on state. In such settings, default standard errors can greatly overstate estimator precision. Instead, if the number of clusters is large, statistical inference after OLS should be based on cluster-robust standard errors. We outline the basic method as well as many complications that can arise in practice. These include cluster-specific fixed effects, few clusters, multiway clustering, and estimators other than OLS.
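The cluster-robust ("sandwich") variance estimator the authors recommend replaces the default OLS middle term with a sum over clusters of within-cluster residual outer products; in the notation below, $X_g$, $y_g$, and $\hat{u}_g$ stack the regressors, outcomes, and OLS residuals of cluster $g$:

```latex
\widehat{V}_{\mathrm{CR}}[\hat{\beta}]
  = (X'X)^{-1}\Bigl(\sum_{g=1}^{G} X_g'\,\hat{u}_g \hat{u}_g'\,X_g\Bigr)(X'X)^{-1},
\qquad \hat{u}_g = y_g - X_g\hat{\beta}.
```

As the abstract notes, this sandwich form is reliable only when the number of clusters $G$ is reasonably large; with few clusters, further corrections are needed.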
Article
Many machine learning algorithms require the input to be represented as a fixed-length feature vector. When it comes to texts, one of the most common fixed-length features is bag-of-words. Despite their popularity, bag-of-words features have two major weaknesses: they lose the ordering of the words and they also ignore semantics of the words. For example, "powerful," "strong" and "Paris" are equally distant. In this paper, we propose Paragraph Vector, an unsupervised algorithm that learns fixed-length feature representations from variable-length pieces of texts, such as sentences, paragraphs, and documents. Our algorithm represents each document by a dense vector which is trained to predict words in the document. Its construction gives our algorithm the potential to overcome the weaknesses of bag-of-words models. Empirical results show that Paragraph Vectors outperforms bag-of-words models as well as other techniques for text representations. Finally, we achieve new state-of-the-art results on several text classification and sentiment analysis tasks.
Article
We describe latent Dirichlet allocation (LDA), a generative probabilistic model for collections of discrete data such as text corpora. LDA is a three-level hierarchical Bayesian model, in which each item of a collection is modeled as a finite mixture over an underlying set of topics. Each topic is, in turn, modeled as an infinite mixture over an underlying set of topic probabilities. In the context of text modeling, the topic probabilities provide an explicit representation of a document. We present efficient approximate inference techniques based on variational methods and an EM algorithm for empirical Bayes parameter estimation. We report results in document modeling, text classification, and collaborative filtering, comparing to a mixture of unigrams model and the probabilistic LSI model.
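The three-level generative process described above can be sketched by ancestral sampling: draw a topic mixture θ from a Dirichlet prior, then for each word draw a topic z from θ and a word from that topic's word distribution. A toy sketch with fixed topic-word probabilities (illustrative only; the paper's contribution is the variational inference procedure, not this sampler, and the variable names are ours):

```python
import random

def lda_generate(doc_len, alpha, beta, rng=random.Random(0)):
    """Sample one document from the LDA generative process.

    alpha: length-K Dirichlet concentration parameters over topics.
    beta:  K x V matrix of topic-word probabilities (held fixed here).
    Returns a list of word ids in [0, V).
    """
    K = len(alpha)
    # Dirichlet draw via normalized independent Gamma variates.
    theta = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(theta)
    theta = [t / total for t in theta]
    words = []
    for _ in range(doc_len):
        z = rng.choices(range(K), weights=theta)[0]          # topic for this word
        w = rng.choices(range(len(beta[z])), weights=beta[z])[0]  # word from topic z
        words.append(w)
    return words
```

Inference in LDA inverts exactly this process: given only the words, it recovers posterior distributions over θ and the per-word topic assignments z.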
Article
A few hubs with many connections share the network with many individuals that have only a few connections.
Article
Bell System Technical Journal, also pp. 623-656 (October)
Article
Minimization of the error probability to determine optimum signals is often difficult to carry out. Consequently, several suboptimum performance measures that are easier than the error probability to evaluate and manipulate have been studied. In this partly tutorial paper, we compare the properties of an often-used measure, the divergence, with a new measure that we have called the Bhattacharyya distance. This new distance measure is often easier to evaluate than the divergence. In the problems we have worked, it gives results that are at least as good as, and often better than, those given by the divergence.
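For two discrete distributions p and q, the Bhattacharyya distance is D_B = -ln(Σ_i √(p_i q_i)), the negative log of the Bhattacharyya coefficient; it is zero exactly when the distributions coincide. A minimal sketch (our illustration, not the paper's code):

```python
import math

def bhattacharyya_distance(p, q):
    """Bhattacharyya distance between discrete distributions p and q.

    D_B = -ln( sum_i sqrt(p_i * q_i) ); zero iff p == q, growing as the
    distributions' overlap shrinks (undefined for disjoint supports).
    """
    # Bhattacharyya coefficient: overlap of the two distributions.
    bc = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    return -math.log(bc)
```

For example, a point mass versus a fair coin gives D_B = ½ ln 2, reflecting partial overlap on one outcome.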
Conference Paper
In this paper we study and evaluate rumor-like methods for combating the spread of rumors on a social network. We model rumor spread as a diffusion process on a network and suggest the use of an "anti-rumor" process similar to the rumor process. We study two natural models by which these anti-rumors may arise. The main metrics we study are the belief time, i.e., the duration for which a person believes the rumor to be true and point of decline, i.e., point after which anti-rumor process dominates the rumor process. We evaluate our methods by simulating rumor spread and anti-rumor spread on a data set derived from the social networking site Twitter and on a synthetic network generated according to the Watts and Strogatz model. We find that the lifetime of a rumor increases if the delay in detecting it increases, and the relationship is at least linear. Further our findings show that coupling the detection and anti-rumor strategy by embedding agents in the network, we call them beacons, is an effective means of fighting the spread of rumor, even if these beacons do not share information.
Conference Paper
A rumor is commonly defined as a statement whose true value is unverifiable. Rumors may spread misinformation (false information) or disinformation (deliberately false information) on a network of people. Identifying rumors is crucial in online social media where large amounts of information are easily spread across a large network by sources with unverified authority. In this paper, we address the problem of rumor detection in microblogs and explore the effectiveness of 3 categories of features: content-based, network-based, and microblog-specific memes for correctly identifying rumors. Moreover, we show how these features are also effective in identifying disinformers, users who endorse a rumor and further help it to spread. We perform our experiments on more than 10,000 manually annotated tweets collected from Twitter and show how our retrieval model achieves more than 0.95 in Mean Average Precision (MAP). Finally, we believe that our dataset is the first large-scale dataset on rumor detection. It can open new dimensions in analyzing online misinformation and other aspects of microblog conversations.
Chapter
This chapter introduces the concept of differential entropy, which is the entropy of a continuous random variable. Differential entropy is also related to the shortest description length, and is similar in many ways to the entropy of a discrete random variable. But there are some important differences, and there is need for some care in using the concept.
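One concrete difference the chapter flags: unlike discrete entropy, differential entropy can be negative. For a Gaussian N(μ, σ²) it is h = ½ ln(2πeσ²) nats, which drops below zero once σ < 1/√(2πe). A small illustrative sketch (our example, not the chapter's):

```python
import math

def gaussian_differential_entropy(sigma):
    """Differential entropy (in nats) of a Gaussian with std. dev. sigma.

    h = 0.5 * ln(2 * pi * e * sigma^2); negative for sufficiently
    concentrated densities, unlike the entropy of a discrete variable.
    """
    return 0.5 * math.log(2 * math.pi * math.e * sigma ** 2)
```

For σ = 1 this gives about 1.419 nats, while σ = 0.1 already yields a negative value.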