Article

The spread of true and false news online

Authors: Soroush Vosoughi, Deb Roy, Sinan Aral

Abstract

Lies spread faster than the truth. There is worldwide concern over false news and the possibility that it can influence political, economic, and social well-being. To understand how false news spreads, Vosoughi et al. used a data set of rumor cascades on Twitter from 2006 to 2017. About 126,000 rumors were spread by ∼3 million people. False news reached more people than the truth; the top 1% of false news cascades diffused to between 1000 and 100,000 people, whereas the truth rarely diffused to more than 1000 people. Falsehood also diffused faster than the truth. The degree of novelty and the emotional reactions of recipients may be responsible for the differences observed. Science, this issue, p. 1146
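The abstract's central quantities, cascade size (how many users a rumor reaches) and cascade depth (how far it propagates from the original tweet), are straightforward to compute from a retweet tree. A minimal sketch on a made-up edge list (the data layout is an assumption, not the authors' pipeline):

```python
from collections import defaultdict, deque

# Hypothetical retweet edges (parent_tweet, child_tweet) for one rumor cascade.
edges = [("root", "a"), ("root", "b"), ("a", "c"), ("c", "d"), ("b", "e")]

children = defaultdict(list)
for parent, child in edges:
    children[parent].append(child)

def cascade_stats(root):
    """Return (size, max_depth) of the cascade rooted at `root`, via BFS."""
    size, max_depth = 1, 0
    queue = deque([(root, 0)])
    while queue:
        node, depth = queue.popleft()
        max_depth = max(max_depth, depth)
        for child in children[node]:
            size += 1
            queue.append((child, depth + 1))
    return size, max_depth

print(cascade_stats("root"))  # (6, 3): six tweets, maximum depth three
```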


... Social media platforms are increasingly a medium where people, from ordinary citizens to famous celebrities and politicians, share their opinions and their lives, learn about current events, attempt to influence each other, and spread misinformation [e.g., (2)(3)(4)]. Individuals, governments, and companies are interested in what content causes readers to share posts because of the power of social media to influence public and consumer opinion [e.g., (5,6)]. One aspect of posts thought to cause message virality is emotional content (7,8), and the general presence of emotions can encourage message sharing (8)(9)(10). ...
... Similarly, when researchers manipulated the valence of framing of true or false political news in a Romanian sample, negative framing of false news increased both anger and fear (the only negative emotions they tested), which then increased message virality (41). In another study, anger was positively associated with sharing for tweets about climate change (which was generally negative) but negatively associated with sharing for tweets about same-sex marriage [which generally had positive emotion words; (6)]. The Facebook algorithm allegedly boosted the visibility of messages where users had clicked the anger emoji reaction in the hopes of generating more user activity (42). ...
... Future work should repeat these analyses, also with in-country/in-language annotators using this or a broader array of emotions (e.g., including hope, curiosity, and boredom); in different countries and languages, around different kinds of events, and on additional topics [given previous findings suggesting differences by topic (6)]; and on platforms that may have different behind-the-scenes algorithms (e.g., YouTube), which might boost different kinds of content. In knowing which models are most predictive, future scholars could use them to develop theory, test hypotheses, and compare effect sizes between a broader array of emotions. ...
Article
Full-text available
While emotional content predicts social media post sharing, competing theories of emotion imply different predictions about how emotional content will influence the virality of social media posts. We tested and compared these theoretical frameworks. Teams of annotators assessed more than 4000 multimedia posts from Polish and Lithuanian Facebook for more than 20 emotions. We found that, drawing on semantic space theory, modeling discrete emotions independently was superior to models examining valence (positive or negative), activation/arousal (high or low), or clusters of emotions, and was on par with but had more explanatory power than a seven basic emotion model. Certain discrete emotions were associated with post sharing, including both positive and negative and relatively lower and higher activation/arousal emotions (e.g., amusement, cute/kama muta, anger, and sadness), even when controlling for number of followers, time up, topic, and Facebook angry reactions. These results provide key insights for better understanding the virality of social media posts.
... Negative rumors are unverified information released by rumormongers to intentionally smear victims and attract public attention (Kamins et al., 1997;Choi and Seo, 2021). Massive amounts of information are produced, disseminated and consumed every day (Fedeli, 2020), creating space for negative tourism rumors to spread quickly and widely (Vosoughi et al., 2018). They may be browsed, followed or shared by users in different places within a short time and deteriorate into a real-time crisis, resulting in more negative comments and destination avoidance (Pal et al., 2020;Bethune et al., 2022;Wang et al., 2023). ...
... The topic of negative tourism rumors is mixed with rumors, derivative rumors and truth (Vosoughi et al., 2018). Once the corrective information is posted, users detect and eliminate false or ambiguous information. ...
... Second, UOCB is a response driven by positive emotion and cognition. The attention rumors attract and the speed at which they spread sometimes exceed those of the truth, which makes users' judgement more difficult (Vosoughi et al., 2018). Once users perceive correction authenticity, they trust corrective content and support corrective behavior, thus showing UOCB such as making positive statements and refuting rumors. ...
Article
Full-text available
Purpose: Negative rumors damage the destination's image and tourist experience. This study aims to compare how rumor correction sources (government vs business vs tourist) affect user online citizenship behavior (UOCB). Design/methodology/approach: Based on the stimuli-organism-response framework, a hypothetical model was established from rumor correction to UOCB. Three scenario experiments (more than 1,000 valid samples) were designed. Study 1 illustrated the effects of different rumor corrections, Study 2 was designed to verify the mediating effects of sympathy and perceived information authenticity (PIA), and the robustness of the results was demonstrated in Study 3. Findings: Government correction elicited the highest sympathy and PIA. Business correction was weaker than tourist correction in arousing sympathy but stronger in enhancing PIA. Sympathy and PIA had a mediating effect on the relationship between rumor correction and UOCB. Practical implications: This study helps to identify the different advantages of rumor correctors and provides insights to prevent the deterioration of negative tourism rumors or even reverse these crises. Originality/value: This study innovates the research perspective of negative tourism rumor governance, expands the understanding of the effect and process of rumor correction and enriches the research content of tourism crisis communication.
... Social media platforms (e.g., Twitter) have become a medium for rapidly spreading rumors along with emerging events [2]. Those rumors may have a lasting effect on users' opinion even after they are debunked, and may continue to influence them if not replaced with convincing evidence [3]. ...
... The figure shows an example tweet from each of the timelines of the authorities that actually supports the rumor. In this work, we introduce the task of detecting the stance of authorities towards rumors in Twitter. Due to the lack of datasets for the task, we construct and release the first Authority STance towards Rumors (AuSTR) dataset (Section 4). ...
Preprint
Full-text available
Several studies examined the leverage of the stance in conversational threads or news articles as a signal for rumor verification. However, none of these studies leveraged the stance of trusted authorities. In this work, we define the task of detecting the stance of authorities towards rumors in Twitter, i.e., whether a tweet from an authority supports the rumor, denies it, or neither. We believe the task is useful to augment the sources of evidence exploited by existing rumor verification models. We construct and release the first Authority STance towards Rumors (AuSTR) dataset, where evidence is retrieved from authority timelines in Arabic Twitter. The collection comprises 811 (rumor tweet, authority tweet) pairs relevant to 292 unique rumors. Due to the relatively limited size of our dataset, we explore the adequacy of existing Arabic datasets of stance towards claims in training BERT-based models for our task, and the effect of augmenting AuSTR with those datasets. Our experiments show that, despite its limited size, a model trained solely on AuSTR with a class-balanced focal loss exhibits performance comparable to the best studied combination of existing datasets augmented with AuSTR, achieving 0.84 macro-F1 and 0.78 F1 on debunking tweets. The results indicate that AuSTR can be sufficient for our task without the need to augment it with existing stance datasets. Finally, we conduct a thorough failure analysis to gain insights for future directions on the task.
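The class-balanced focal loss mentioned in the abstract down-weights easy examples and re-weights classes. A rough sketch of that loss family in PyTorch (a generic multi-class focal loss with per-class weights; the authors' exact formulation and weight values are assumptions here):

```python
import torch
import torch.nn.functional as F

def class_balanced_focal_loss(logits, targets, class_weights, gamma=2.0):
    """Focal loss with per-class weights: -w_c * (1 - p_t)**gamma * log(p_t)."""
    log_probs = F.log_softmax(logits, dim=-1)
    probs = log_probs.exp()
    # Probability (and log-probability) assigned to each example's true class.
    pt = probs.gather(1, targets.unsqueeze(1)).squeeze(1)
    log_pt = log_probs.gather(1, targets.unsqueeze(1)).squeeze(1)
    weights = class_weights[targets]            # per-example weight by its class
    loss = -weights * (1.0 - pt) ** gamma * log_pt
    return loss.mean()

# Toy usage: 3 stance classes (support / deny / neither) with assumed
# inverse-frequency class weights.
logits = torch.randn(4, 3)
targets = torch.tensor([0, 2, 1, 2])
weights = torch.tensor([3.0, 1.5, 1.0])
print(class_balanced_focal_loss(logits, targets, weights))
```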
... Conspiracy theories also often function as a source of entertainment, with many people sharing them alongside jokes, memes, and playful trolling. In line with this, empirical evidence suggests that the conspiracy theories which spread most efficiently online are especially juicy, titillating, and entertaining (Vosoughi et al., 2018;Rosenblum & Muirhead, 2019, ch. 2;Van Prooijen et al., 2022). ...
Article
Full-text available
Some incredibly far-fetched conspiracy theories circulate online these days. For most of us, clear evidence would be required before we’d believe these extraordinary theories. Yet, conspiracists often cite evidence that seems transparently very weak. This is puzzling, since conspiracists often aren’t irrational people who are incapable of rationally processing evidence. I argue that existing accounts of conspiracist belief formation don’t fully address this puzzle. Then, drawing on both philosophical and empirical considerations, I propose a new explanation that appeals to the role of the imagination in conspiracist belief formation. I argue that conspiracists first become imaginatively absorbed in conspiracist narratives, where this helps to explain how they process their evidence. From there, we can better understand why they find this evidence so compelling, as well as the psychological role it plays in their belief forming processes. This account also has practical implications for combatting the spread of online conspiracy theories.
... These industries have harnessed strategies that primarily focus on boosting engagement and consumption. Due to their inherent design, social media platforms have been effective conduits for misinformation; algorithms that dictate content propagation often prioritise engagement, inadvertently promoting sensationalised and untrue content (Vosoughi et al., 2018). This issue has been particularly acute in political propaganda and health-related misinformation (Pennycook & Rand, 2018). ...
... Because of this barrier (which we understand as a cultural TRIM), groups' cultural identities are less likely to be compromised. According to Vosoughi et al. (2018), fact-checks spread less quickly than fake news, making it harder for them to be truly effective (Van der Linden et al. 2020). ...
Article
Full-text available
In this paper, we critically consider the analogy between "infodemic" and "pandemic", i.e. the spread of fake news about COVID-19 as a medial virus and the infection with the biological virus itself from the perspective of cultural evolutionary theory (CET). After confronting three major shortcomings of the 'infodemic' concept, we use CET as a background framework to analyze this phenomenon. To do so, we summarize which biases are crucial for transmission in terms of cultural selection and how transmission is restricted by filter bubbles or echo chambers acting as TRIMS (transmission isolating mechanisms) post "infection", which isolate false from trustworthy scientific information in the context of the Corona pandemic. This is followed by a demonstration of the threat to biological fitness posed by the effects of an infection with fake news, which leads to a reduced willingness to vaccinate and follow health measures. We identify fake news on Covid as pseudoscience, trying to immunize itself from external influences. We then address the question of how to combat the infodemic. Since debunking strategies, such as warnings by fact-checking, have proven relatively ineffective in combating fake news, the inoculation theory from psychology might offer an alternative solution. Through its underlying 'prebunking strategy', which educates individuals about the risks and tactics of fake news prior to a potential infection, they could be 'immunized' in advance, similar to a virological vaccination. Although we recognize that the pandemic/infodemic analogy is in fact far from perfect, we believe that CET could provide a theoretical underpinning in order to give much more semantic depth to the concept 'infodemic'.
... On the one hand, donations mean more opportunities for societal well-being and advancement. On the other hand, accepting controversial donations can cause public outrage and long-term reputational damage (7)(8)(9). ...
Article
Full-text available
Philanthropy is essential to public goods such as education and research, arts and culture, and the provision of services to those in need. Providers of public goods commonly struggle with the dilemma of whether to accept donations from morally tainted donors. Ethicists also disagree on how to manage tainted donations. Forgoing such donations reduces opportunities for societal well-being and advancement; however, accepting them can damage institutional and individual reputations. Half of professional fundraisers have faced tainted donors, but only around a third of their institutions had relevant policies (n = 52). Here, we draw on two large samples of US laypeople (ns = 2,019; 2,566) and a unique sample of experts (professional fundraisers, n = 694) to provide empirical insights into various aspects of tainted donations that affect moral acceptability: the nature of the moral taint (criminal or morally ambiguous behavior), donation size, anonymity, and institution type. We find interesting patterns of convergence (rejecting criminal donations), divergence (professionals’ aversion to large tainted donations), and indifference (marginal role of anonymity) across the samples. Laypeople also applied slightly higher standards to universities and museums than to charities. Our results provide evidence of how complex moral trade-offs are resolved differentially, and can thus motivate and inform policy development for institutions dealing with controversial donors.
... Aaronovitch [2] defines conspiracy theories as 'the attribution of deliberate agency to something more likely to be accidental or unintended; therefore, it is the unnecessary assumption of conspiracy when other explanations are more probable. ' Due to the rapid spread of information across the internet, coupled with the alarming speed at which false information can proliferate [3], we find ourselves amidst what some have dubbed a "golden age" of conspiracy theories [4]. Being a distinct form of misinformation, conspiracy theories exhibit unique characteristics. ...
Conference Paper
Full-text available
Conspiracy theories have become a prominent and concerning aspect of online discourse, posing challenges to information integrity and societal trust. As such, we address conspiracy theory detection as proposed by the ACTI @ EVALITA 2023 shared task. The combination of pre-trained sentence Transformer models and data augmentation techniques enabled us to secure first place in the final leaderboard of both sub-tasks. Our methodology attained F1 scores of 85.71% in the binary classification and 91.23% for the fine-grained conspiracy topic classification, surpassing other competing systems.
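A rough sketch of the general pipeline the abstract describes, pre-trained sentence embeddings feeding a lightweight classifier (the checkpoint name, toy data, and classifier choice are assumptions; the team's actual augmentation and fine-tuning are not reproduced):

```python
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression

# Tiny hypothetical training set (1 = conspiratorial, 0 = not).
texts = [
    "secret elites planned the outbreak",
    "the vaccine contains tracking microchips",
    "the council approved the new bus route",
    "local team wins the regional cup",
]
labels = [1, 1, 0, 0]

# Any multilingual sentence-embedding checkpoint could stand in here;
# this model name is an assumption, not the one used in the shared task.
encoder = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")
X = encoder.encode(texts)

clf = LogisticRegression(max_iter=1000).fit(X, labels)
print(clf.predict(encoder.encode(["they are hiding the truth about 5G"])))
```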
... The proliferation of misinformation on social networks-information that is incorrect or misleading-has become an escalating concern in public debate and academic research in recent years [1][2][3][4][5]. In addition to studying the diffusion of misinformation and seeking potential interventions and factors that might remedy misinformation among individuals, researchers further reported that misinformation significantly alters individuals' decision-making behaviour [1,2,[5][6][7][8][9][10][11][12][13][14]. For instance, a randomized controlled trial shows that misinformation around COVID-19 vaccines induced a decline in vaccination intent of 6.2 percentage points in the UK and 6.4 percentage points in the USA among those who stated that they would definitely accept a vaccine [4]. ...
Article
Full-text available
Human societies are organized and developed through collective cooperative behaviours. Based on the information in their environment, individuals can form collective cooperation by strategically changing unfavourable surroundings and imitating superior behaviours. However, facing the rampant proliferation and spreading of misinformation, we still lack systematic investigations into the impact of misinformation on the evolution of collective cooperation. Here, we study this problem by classical evolutionary game theory. We find that the existence of misinformation generally impedes the emergence of collective cooperation on networks, although the level of cooperation is slightly higher for weak social cooperative dilemma below a proven threshold. We further show that this possible advantage diminishes as social connections become denser, suggesting that the detrimental effect of misinformation further increases when ‘social viscosity’ is low. Our results uncover the quantitative effect of misinformation on suppressing collective cooperation, and pave the way for designing possible mechanisms to improve collective cooperation.
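A minimal sketch of the modelling style the abstract points to: a donation game on a network where agents imitate apparently better-off neighbours, but misinformation occasionally corrupts the perceived payoff. The network type, payoff values, and noise mechanism are illustrative assumptions, not the paper's exact model:

```python
import random
import networkx as nx

random.seed(0)
G = nx.watts_strogatz_graph(200, k=4, p=0.1)        # illustrative social network
strategy = {n: random.choice([0, 1]) for n in G}    # 1 = cooperate, 0 = defect
b, c = 1.5, 1.0        # benefit and cost of cooperation (assumed values)
p_misinfo = 0.2        # chance a perceived payoff is corrupted

def payoff(n):
    """Accumulated donation-game payoff of node n against its neighbours."""
    return sum(b * strategy[m] - c * strategy[n] for m in G[n])

for step in range(10_000):
    n = random.choice(list(G))
    m = random.choice(list(G[n]))
    perceived = payoff(m)
    if random.random() < p_misinfo:        # misinformation distorts comparison
        perceived += random.uniform(-2, 2)
    if perceived > payoff(n):              # imitate-if-better update rule
        strategy[n] = strategy[m]

print("cooperator fraction:", sum(strategy.values()) / len(strategy))
```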
... Detection algorithms have the crucial task of implementing technical approaches in service provisioning and methodologies to aggregate a variety of information in order to infer the reliability of news the user wishes to interact with. However, debunking is a difficult task and has to overcome several challenges: aside from the sheer volume of published fake news to be verified, corrective information can sometimes provoke a so-called "backfire effect" in which respondents more strongly endorse a misperception about a controversial political or scientific issue when their beliefs or predispositions are challenged [2]; finally, debunks do not reach as many people as fake news, and they do not spread nearly as quickly [3]. ...
Article
Full-text available
Technological development combined with the evolution of the Internet has made it possible to reach an increasing number of people over the years and given them the opportunity to access information published on the network. The volume of fake news generated daily, combined with the ease with which it can be shared, has made the phenomenon all but uncontrollable. Furthermore, the quality with which malicious content is made is increasingly high, so even professional experts, such as journalists, have difficulty recognizing which news is fake and which is real. This paper aims to implement an architecture that provides a service to final users that assures the reliability of news providers and the quality of news based on innovative tools. The proposed models take advantage of several Machine Learning approaches for fake news detection tasks and take into account well-known attacks on trust. Finally, the implemented architecture is tested with a well-known dataset and shows how the proposed models can effectively identify fake news and isolate malicious sources.
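As a toy illustration of the machine-learning component such an architecture might contain (a generic TF-IDF text classifier on placeholder headlines; the paper's actual models and trust-attack handling are not shown):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Placeholder headlines with labels (1 = fake, 0 = real).
headlines = [
    "miracle cure doctors don't want you to know",
    "shocking secret behind the election revealed",
    "parliament passes annual budget bill",
    "university publishes climate report",
]
labels = [1, 1, 0, 0]

# Word and bigram TF-IDF features feeding a linear classifier.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(headlines, labels)
print(model.predict(["you won't believe this hidden truth"]))
```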
... This study finds that the virtual forum tool is associated with learning in educational research; Koranteng et al. (2019) argue that gaining knowledge does not necessarily translate into a commitment to using the technology, and could also lead to the generation of false knowledge that may be disseminated and even accepted by social and academic groups (O'Keeffe et al., 2011; Vosoughi, Roy, & Aral, 2018); in this way, learning becomes collaborative and takes concrete form in the socio-educational interaction of research as students exchange experiences and academic information. ...
Article
Full-text available
This study aims to determine the relationship between the virtual classroom and learning in educational research among university students. Regarding methods and instruments, it is a quantitative, non-experimental, cross-sectional correlational study in which surveys were administered to 120 university students from three professional academic schools of education at a public university. The results show a high and significant Kendall's tau-b correlation (0.807) between the virtual classroom and research learning, indicating that university students adapted to virtual teaching-learning by adopting synchronous and asynchronous strategies through which they developed educational research learning for the benefit of their professional training. It is concluded that the virtual classroom, through videoconferencing, chat, and forum tools, has allowed teachers to interact with university students through virtualized education, achieving meaningful, active, and collaborative learning; educational research was carried out without losing the essence of education.
... Previous research on public scientific material relating to the COVID-19 pandemic, on the other hand, has primarily concentrated on qualitative analysis, ignoring internet audience reactions (Bucher et al. 2021). While datasets on social media communication about the coronavirus are available, filtering this volume of information is a considerable challenge, and work on social media information propagation during the pandemic has largely focused on general trends, political attitudes, and the spread of misinformation, which surpasses true information in its spread (Vosoughi et al. 2018). ...
Article
Full-text available
Social media platforms that disseminate scientific information to the public during the COVID-19 pandemic highlighted the importance of the topic of scientific communication. Content creators in the field, as well as researchers who study the impact of scientific information online, are interested in how people react to these information resources. This study aims to devise a framework that can sift through large social media datasets and find specific feedback on content delivery, enabling scientific content creators to gain insights into how the public perceives scientific information, and how their behavior toward science communication (e.g., through videos or texts) is related to their information-seeking behavior. To collect public reactions to scientific information, the study focused on Twitter users who are doctors, researchers, science communicators, or representatives of research institutes, and processed their replies for two years from the start of the pandemic. The study aimed to develop a solution powered by topic modeling, enhanced by manual validation and other machine learning techniques such as word embeddings, capable of filtering massive social media datasets in search of documents related to reactions to scientific communication. The architecture developed in this paper can be replicated for finding any documents related to niche topics in social media data.
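A minimal sketch of topic-model-based filtering of the kind the abstract outlines (scikit-learn LDA on placeholder replies; the topic count, threshold, and data are assumptions):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

replies = [
    "thanks for explaining the vaccine trial results so clearly",
    "great video, the graphics made the variants easy to understand",
    "what time is the match tonight",
    "politicians are all the same",
]

vec = CountVectorizer(stop_words="english")
X = vec.fit_transform(replies)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)
doc_topics = lda.transform(X)   # per-document topic probabilities

# Keep documents whose dominant topic passes a probability threshold;
# in practice the science-communication topic would be identified manually.
keep = [r for r, probs in zip(replies, doc_topics) if probs.max() > 0.7]
print(keep)
```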
... Such continuous exposure to ambiguous or misleading content can gradually erode trust in traditionally reliable sources, leaving people unsure about where to seek genuine information. [92] The surge in AI-driven misinformation can make individuals feel obligated to authenticate every piece of data they come across, potentially adding to feelings of being overwhelmed and the relentless pressure to recheck the facts. [93] Misinformation, when broadly accepted, can also place individuals in conflict with peers or community members, fostering anxiety rooted in potential isolation or the uphill battle of debunking widely held, yet inaccurate, beliefs. ...
Article
Full-text available
The rapid advancement of artificial intelligence (AI) and flexible AI systems, like generative AI, has sparked concern over its potential psychological impact, leading to an urgent need for a multidisciplinary approach to understand and mitigate AI-induced anxiety. This article conducts a thorough investigation of the unique stressors of AI anxiety present in different age demographics including young adults, middle-aged adults, and seniors. It also sheds light on common fundamental causes of AI anxiety such as concerns about privacy, fake AI-generated content, uncontrolled AI growth, and bias in AI. While this article confines AI anxiety to non-clinical parameters, it explores its far-reaching impacts on mental and physical health, along with repercussions on healthcare expenses, economic efficiency, and fertility. To tackle AI anxiety, multidisciplinary solutions are proposed. First, the paper highlights the urgent need for AI-adapted clinical guidelines, incorporating tailored diagnostic and therapeutic measures aimed at mitigating AI anxiety. It also emphasizes the role of continuous education in demystifying AI, advocating for prompt updates to school curricula to match the rapid progress in AI technology. Emphasis is given to promoting regulatory measures to balance AI innovation and societal adaptability, including engineering constraints and ethical design as essential components of responsible AI development. Lastly, it explores pioneering solutions to mitigate AI anxiety, such as leveraging AI in mental health interventions. Heightened focus and allocation of resources are critical steps to confront the escalating mental health challenges as society navigates the impending era of pervasive AI.
... Catchy titles alone can lead to thousands of instant shares without verifying sources or information. This has caused two significant issues: a surge in data volume and a shortage of tools to discern credible information from untrue content, giving rise to fake news [9], false news [59], satire news [19], disinformation [39], misinformation [40], and rumors [24]. False information serves various purposes, like increasing website or social media traffic and influencing public opinion on political decisions and financial transactions. ...
Article
Full-text available
Social media has become an integral part of people's lives, resulting in a constant flow of information. However, a concerning trend has emerged with the rapid spread of fake news, attributed to the lack of verification mechanisms. Fake news has far-reaching consequences, influencing public opinion, disrupting democracy, fueling social tensions, and impacting various domains such as health, environment, and the economy. In order to identify fake news under data sparsity, especially in low-resource languages such as Arabic and its dialects, we propose a few-shot learning fake news detection model based on sentence transformer fine-tuning, using no crafted prompts and a language model with few parameters. The experimental results show that the proposed method can achieve higher performance with fewer news samples. This approach provided a 71% F1 score on the Algerian dialect fake news dataset and a 70% F1 score on the Modern Standard Arabic (MSA) version of the same dataset, which proves that the approach can work on standard Arabic and its dialects. Therefore, the proposed model can identify fake news in several domains concerning the Algerian community such as politics, COVID-19, tourism, e-commerce, sport, accidents, and car prices.
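One simple way to picture few-shot classification over sentence-transformer embeddings is nearest-centroid matching. This sketch uses placeholder data and an assumed checkpoint, and does not reproduce the paper's fine-tuning procedure:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# A handful of labelled examples per class (the few-shot setting).
examples = {
    "fake": ["free money for everyone who shares this post",
             "this fruit cures the virus overnight"],
    "real": ["the ministry announced new road works",
             "fuel prices rose three percent last month"],
}

# Assumed multilingual checkpoint; not the model used in the paper.
encoder = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")
centroids = {lab: encoder.encode(txts).mean(axis=0)
             for lab, txts in examples.items()}

def classify(text):
    """Assign the label whose class centroid is most cosine-similar."""
    v = encoder.encode([text])[0]
    sims = {lab: v @ c / (np.linalg.norm(v) * np.linalg.norm(c))
            for lab, c in centroids.items()}
    return max(sims, key=sims.get)

print(classify("drinking this tea makes you immune"))
```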
... In a different task, over half of students believed an anonymously posted video purporting to show voter fraud in the US was real, even though it was filmed in Russia (Breakstone et al., 2021). In another study, Vosoughi et al. (2018) examined approximately 126,000 stories tweeted over 4.5 million times between 2006 and 2017. The researchers were interested in understanding what accounted for the differential diffusion of verified true and false news stories. ...
... Added to this is the evidence that false news spreads farther and faster than true news (Vosoughi et al., 2018), and it is not easy to measure the impact and reach of fact-checks. In fact, within academia there are many ...
Article
Full-text available
In this chapter we offer a descriptive approach to the figure of fact-checkers, defining them as individuals or teams responsible for verifying the accuracy of information published in the media or disseminated on the Internet, with the aim of debunking false news and ensuring the reliability of information. We examine their value as promoters of truth in public discourse and guarantors of transparency within democratic systems, and briefly review their origins, evolution, and worldwide expansion. We classify these organizations according to their funding model, geographic scope, and thematic scope, and describe their work process, based on the detection, analysis, and dissemination of reports on the veracity of content. Finally, we consider the challenges these organizations face in carrying out their work, such as the lack of reliable sources, limited time to address ever-growing amounts of disinformation, and users' lack of awareness and trust.
... The microblogging platform, with 436 million users, occupies a modest fifteenth place in the ranking of the world's most-used social networks and messaging services (We Are Social; Hootsuite, 2022). However, it is the favourite network of journalists and media outlets and has been closely tied to news processes since its origins (Míguez-González et al., 2023); moreover, some research suggests that false information, especially on political topics, is shared faster on this platform than on other networks (Vargo et al., 2018; Vosoughi et al., 2018). ...
Article
Full-text available
This chapter aims to describe the content distribution strategies that Ibero-American fact-checkers belonging to the IFCN deploy on social networks, considering the intent of their publications, their topics, their formal and narrative characteristics, and their impact. Twitter, Facebook, YouTube, and Instagram are the four social networks most used by fact-checkers; since 2020 these organizations have increased their activity on these networks, gaining followers and wider dissemination of their content; even so, fact-checkers' results in terms of reach and interaction are uneven and their impact remains modest. Most of the communicative content these organizations disseminate on social networks consists of verifications, that is, debunkings of false news or positive verifications, but they also publish other content and, to a lesser extent, media-literacy or self-promotional content. Thematically, the two main axes around which fact-checkers' content revolves are politics and health, with activity between 2020 and 2022 strongly marked by the COVID-19 pandemic.
... According to MIT professor Sinan Aral (2020), political misinformation has the highest rate of virality, propagating deeper and spreading more widely. And, within this typology of false information, the greatest impact comes from items "(...) about terrorism, natural disasters, science, urban legends, or financial information" (Vosoughi et al., 2018: 1146). ...
Article
Full-text available
The origins of disinformation go back to the beginning of humanity, since our innate narrative competence and the need to adapt the media of the moment to our communicative needs have led to the creation of alternative narratives that allow us to manage our inability to experience most events first-hand. While documented examples of disinformation were most frequent in the military and geostrategic sphere, today its strategic normalization means we must establish an archaeology of the concepts to understand their historical trajectory, and above all their ramifications. In contrast to the idea that information is power, the lack of information now presents itself as a necessary prior state for influencing others and developing strategies of polarization and division. In the last decade, alongside the terms "fake news", "post-truth", "alternative facts", "disinformation", "fact-checking", and "polarization", a series of other concepts has emerged that attempt to combine the tradition of each discipline with the angle and focus from which such a complex phenomenon is analysed. Hybrid threats, cognitive warfare, information bubbles, echo chambers, information manipulation, astroturfing, etc. are perhaps the most representative.
... Moreover, cultural evolution theory predicts that cultural maladaptation may often result from the same social learning mechanisms that also create adaptive outcomes [19]. For example, attention-grabbing traits, such as inflammatory misinformation, often spread more rapidly and more broadly than factual information [20], precisely because human social learning mechanisms favour them. Such traits may be undesirable for society and could be harmful for those who adopt them. ...
Article
Full-text available
It has been proposed that climate adaptation research can benefit from an evolutionary approach. But related empirical research is lacking. We advance the evolutionary study of climate adaptation with two case studies from contemporary United States agriculture. First, we define ‘cultural adaptation to climate change’ as a mechanistic process of population-level cultural change. We argue this definition enables rigorous comparisons, yields testable hypotheses from mathematical theory and distinguishes adaptive change, non-adaptive change and desirable policy outcomes. Next, we develop an operational approach to identify ‘cultural adaptation to climate change’ based on established empirical criteria. We apply this approach to data on crop choices and the use of cover crops between 2008 and 2021 from the United States. We find evidence that crop choices are adapting to local trends in two separate climate variables in some regions of the USA. But evidence suggests that cover cropping may be adapting more to the economic environment than climatic conditions. Further research is needed to characterize the process of cultural adaptation, particularly the routes and mechanisms of cultural transmission. Furthermore, climate adaptation policy could benefit from research on factors that differentiate regions exhibiting adaptive trends in crop choice from those that do not. This article is part of the theme issue ‘Climate change adaptation needs a science of culture’.
... Scholars have warned of rising levels of misinformation and disinformation, political polarization, erosion of democratic institutions, global authoritarianism, and decreasing faith in democracy. 1,2,3,4,5,6,7,8,9 Tackling these threats will require the concerted efforts of many institutions, including K-12 school systems. Although young people learn about democracy and ways to engage in civic life through interactions in their families, peer groups, communities, out-of-school-time activities, and other contexts, 10,11 K-12 schools play a key role because of the large number of students they serve and their formal responsibility of educating for democracy. ...
... The utilisation of Twitter as a valuable source of information has already yielded successful outcomes in the examination of political discussions (Majó-Vázquez et al., 2021; Stella et al., 2018; Bastos and Mercer, 2017; Majó-Vázquez et al., 2017), the analysis of networked citizen politics and political participation (Peña-López et al., 2014), the exploration of fake news distribution, exposure and engagement (Grimberg et al., 2019; Vosoughi et al., 2018), the identification of influential actors in disinformation campaigns (Smith et al., 2021) and the investigation of the dissemination of political content beyond text, such as images (Martínez-Rolán and Piñeiro-Otero, 2016) and links (Gabielkov et al., 2016). Considering this background, we deemed it appropriate to design our research based on a case study methodology, given that our primary research questions revolve around the utilisation of information, we lack control over behavioural events and our focus centres on a contemporary phenomenon (Yin, 2018, p. 32 ff.). ...
Article
Full-text available
This study examined the use of data from a transparency portal in media coverage of a high-profile case of alleged public procurement irregularities in Spain. Access to Twitter API was used to identify relevant URLs related to this issue. It was found that direct links to a portal were of low relevance, and most of the linked documents did not even mention the availability of data from a portal. Qualitative analysis revealed that the most frequent topics were the use of portal data as an authoritative argument to endorse information, the statement that the portal did not contain sufficient information for journalistic purposes, and the absence of data on third parties involved in public procurement. It is recommended that governments promote the existence of Portals and make media outlets aware that providing links to original data is beneficial for their reporting. In addition, linked open data should be used to ensure accuracy and transparency.
... The role of bots in spreading disinformation was carefully analysed by Vosoughi and colleagues (Vosoughi et al., 2018). Searching the complete Twitter corpus, they assembled a set of roughly 126,000 tweets containing disinformation flagged or reviewed by fact-checking agencies based in the United States. That work analyses propagation time, the number of users reached, and the depth (links and nodes) the posts travel ...
Article
Full-text available
In this text, we discuss the use of bots and artificial intelligence (AI) to combat the phenomenon of fake news and disinformation in the context of the Covid-19 pandemic. To that end, we selected vaccine-related content checked and published by three Brazilian fact-checking agencies, as well as content about the vaccines on Twitter. A bot written in Python measured the relationship and reach of this content, assessing possible impacts in Brazil's complex social context during May 2021. We find that the use of AI can reduce the impact of fake news on the media ecosystem. We highlight the importance of fact-checking and the need for it to achieve reach and speed similar to the spread of fake news, saving human lives through preventive communication.
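The abstract mentions a Python bot that measured the reach of vaccine-related content on Twitter. A minimal sketch of how such a measurement could look with the tweepy library (the query, token, and metric choices are assumptions, and current Twitter/X API access tiers may no longer permit this kind of search):

```python
import tweepy

# Hypothetical credentials; API access rules have changed over time.
client = tweepy.Client(bearer_token="YOUR_BEARER_TOKEN")

resp = client.search_recent_tweets(
    query="vacina (fake OR boato) lang:pt -is:retweet",  # assumed query
    tweet_fields=["public_metrics", "created_at"],
    max_results=100,
)

# Aggregate retweet counts as a crude proxy for reach.
tweets = resp.data or []
total_retweets = sum(t.public_metrics["retweet_count"] for t in tweets)
print("tweets:", len(tweets), "total retweets:", total_retweets)
```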
... Ex-Google design ethicist Tristan Harris has explained that Facebook triggers our base impulses with clever user interface design such as notifications and "Like" buttons (Bosker, 2016). Confessions made by the designers are also supported by the research done on the topic - social media is a platform where negative content is distributed farther and faster (Vosoughi et al., 2018). Algorithms that are built on incentive structures and social cues amplify the anger of users on social media platforms (Fisher & Taub, 2018). ...
Article
Full-text available
Facebook Inc. (now Meta Platforms) has been a target of several accusations regarding privacy issues, dark pattern design, spreading of disinformation and polarizing its users. Based on several leaked documents, the company's public relations have often contradicted its internal discussions and research. This study examines these issues by analyzing the leaked documents and published news articles. It outlines the dark patterns that the company has applied to its platform's functionality, and discusses how they allow toxic behavior, hate speech and disinformation to flourish on the platform. The study also discusses some of the discrepancies between Facebook Inc.'s public relations and its internal work culture and discussions.
... This suggests that individuals are most willing to communicate about climate change when they are receiving information from these sources. The credibility of the broadcast source is also an important factor in determining the success of climate change news, as noted in studies by Vosoughi et al. (2018), Pornpitakpan (2004), and Betsch and Sachse (2013). Faced with complex phenomena such as global warming, people often do not have enough time or cognitive ability to fully understand related issues, so in most cases, people rely on reliable sources of information to help them understand the issues and make judgments. ...
Article
Full-text available
Climate change communication is an important behavioral manifestation of the public's understanding, expression, and participation in addressing climate change. Social media play an important role in climate change knowledge communication. Does social media promote climate change communication behavior in the Chinese context? Is its effect stronger than that of other types of media? Combined with the research context, we divide media into central media, local media, and social media and construct the influence mechanism model of media use on climate change communication behavior. In this study, a questionnaire survey was conducted among the public in China, and 1062 valid questionnaires were empirically tested by methods of hierarchical regression and bootstrapping. According to the findings of the study, different media use has a positive effect on climate change communication behavior. While social media is more likely to be used by the public to obtain climate change-related information than central and local media (with a mean value of 3.84 for social media compared to 3.51 for central media and 3.19 for local media), it is actually the central media that have the greatest effect on climate change communication behavior. This is evident in the total effect value, where the central media have a value of 0.21, which is higher than social media's value of 0.20 and local media's value of 0.12. Risk perception and environmental values play an important mediating role in the influence of media use on climate change communication behavior, among which environmental values have the largest mediating effect. (Specifically, the mediating effects of environmental values were 26.83%, 31.28%, and 38.57% for central media, local media, and social media, respectively.) In addition, risk perception can also positively affect environmental values, thus forming a chain mediating effect between media use and climate change communication behavior (the confidence intervals for the chain mediating effect also exclude 0).
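The abstract's bootstrapped mediation analysis can be sketched as follows. Variable names, effect sizes, and the simple indirect effect a*b are illustrative assumptions, not the authors' exact specification:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
media_use = rng.normal(size=n)                         # X: media use
values = 0.5 * media_use + rng.normal(size=n)          # M: environmental values
behavior = 0.4 * values + 0.2 * media_use + rng.normal(size=n)  # Y: communication

def indirect_effect(x, m, y):
    a = np.polyfit(x, m, 1)[0]                         # X -> M slope
    Xmat = np.column_stack([m, x, np.ones_like(x)])    # Y ~ M + X + intercept
    b = np.linalg.lstsq(Xmat, y, rcond=None)[0][0]     # M -> Y controlling for X
    return a * b

# Percentile bootstrap confidence interval for the indirect effect.
boot = []
for _ in range(2000):
    idx = rng.integers(0, n, n)
    boot.append(indirect_effect(media_use[idx], values[idx], behavior[idx]))
print(np.percentile(boot, [2.5, 97.5]))
```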
... Another example comes from Corneille et al. (2020; see also Béna et al., 2022), who found that repeated statements were more likely to be perceived as "previously used as fake news on social media" than new ones. One possible reason is that when the judgment refers to social media, a context where fabricated and false information may spread widely (e.g., Del Vicario et al., 2016; Vosoughi et al., 2018; Juul & Ugander, 2021), repetition-induced fluency may be a cue for "fake news" judgments rather than for truth. Further investigations, however, are required to better understand this "fake news by repetition" effect because of interpretational issues with the instructions that were used and because truth effect studies mimicking social media postings found the typical effect (Nadarevic et al., 2020; Smelter & Calvillo, 2020; for a discussion, see Béna et al., 2022). ...
Article
Full-text available
In two high‐powered experiments, we investigated how prior exposure to statements presented in a clickbait format increases the perceived truth of their content. In Experiment 1 ( N = 241), we hypothesized and found that prior exposure increased the proportion of “true” judgments for both non‐clickbait and clickbait content, but with a reduced effect of prior exposure for statements originally presented in a clickbait format. In Experiment 2 ( N = 291), turning to continuous ratings, we found higher truth ratings for repeated than new clickbait statements, even when repetition evidently originated from prior exposure to clickbait statements. The present findings suggest that exposure to clickbait headlines can increase their content's truth judgments despite their overall lack of credibility, although to a lesser extent than for more regular statements. The present research additionally supports an implausibility account rather than a source memory account of the truth effect with clickbait statements.
... The discrepancy between our findings and this evidence for positivity bias in "fake news" is puzzling, but we think it is due to differences in granularity and methodology. Pröllochs et al. worked with unique news-related rumors spreading through a general Twitter population (Pröllochs et al., 2021;Vosoughi et al., 2018) whereas we analyzed sub-conversations about the same topic within a single conspiracy theory community. Content biases for emotional valence may vary when messages are shared between like-minded individuals or embedded within a single conversation. ...
Article
Full-text available
During the 2020 US presidential election, conspiracy theories about large-scale voter fraud were widely circulated on social media platforms. Given their scale, persistence, and impact, it is critically important to understand the mechanisms that caused these theories to spread. The aim of this preregistered study was to investigate whether retweet frequencies among proponents of voter fraud conspiracy theories on Twitter during the 2020 US election are consistent with frequency bias and/or content bias. To do this, we conducted generative inference using an agent-based model of cultural transmission on Twitter and the VoterFraud2020 dataset. The results show that the observed retweet distribution is consistent with a strong content bias causing users to preferentially retweet tweets with negative emotional valence. Frequency information appears to be largely irrelevant to future retweet count. Follower count strongly predicts retweet count in a simpler linear model but does not appear to drive the overall retweet distribution after temporal dynamics are accounted for. Future studies could apply our methodology in a comparative framework to assess whether content bias for emotional valence in conspiracy theory messages differs from other forms of information on social media.
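A toy version of the kind of agent-based transmission model the abstract describes, in which the probability of retweeting rises with a tweet's negative emotional valence (the sigmoid bias form and parameter values are assumptions):

```python
import math
import random

random.seed(1)
N_TWEETS, STEPS, BETA = 200, 50_000, 2.0   # BETA: content-bias strength (assumed)

valence = [random.uniform(-1, 1) for _ in range(N_TWEETS)]  # -1 = very negative
retweets = [0] * N_TWEETS

# Each step an agent sees one random tweet and retweets with a probability
# that increases with negativity: a content bias, not a frequency bias.
for _ in range(STEPS):
    t = random.randrange(N_TWEETS)
    p = 1 / (1 + math.exp(BETA * valence[t]))  # sigmoid: more negative, higher p
    if random.random() < p:
        retweets[t] += 1

top = max(range(N_TWEETS), key=retweets.__getitem__)
print("most-retweeted valence:", round(valence[top], 2), "count:", retweets[top])
```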
... In this sense, socio-digital networks have been in the public eye not only as technologies but as corporate agents that facilitate, by commission or omission, both spontaneous and calculated operations of mass disinformation; at bottom, the debate centres on whether the discursive architecture with which these platforms are designed permits and even incentivizes these practices, and whether their architects have taken the necessary measures to de-escalate such operations (Allcott et al., 2019; Vosoughi et al., 2018). ...
Article
Full-text available
If effective, disinformation operations on social networks could generate false beliefs in the public and lead people to make political decisions contrary to their interests. In response, various actors have implemented measures, most notably platform (self-)regulation to flag malicious information and public information-literacy campaigns. Even so, there is not yet sufficient empirical evidence on the extent to which false information generates false beliefs, or on the effectiveness of these measures in reducing citizens' credulity. This study is based on a representative survey conducted during the 2021 elections in Mexico (n=1750), with the aim of analysing how using social networks to follow the electoral campaign influences users' credulity toward false information. We also observe how these effects are moderated, on the one hand, by the different platforms, whose architecture is more or less prone to warning about disinformation, and on the other, by the attitudes and strategies of more or less literate citizens seeking to avoid being disinformed. We find that using platforms that have adopted measures to warn about disinformation does not predict greater or lesser credulity, whereas WhatsApp, with fewer controls and a more private character, does increase it. Moreover, exercising such strategies does not reduce credulity in disinformation, unlike attitudes against it, which reduce it substantially.
Article
Full-text available
The widespread use of Artificial Intelligence (AI) has transformed civilization and greatly affected human behavior and well-being. This empirical study examines and quantifies AI's various effects on behavior and well-being. This study examines the complex interaction between AI and human behavior across domains using a comprehensive literature review and a variety of empirical data sources. It examines how AI-driven personalisation, recommendation systems, and content curation affect people's preferences and interactions. AI's impact on healthcare, education, and mental health is also examined. The empirical investigation also covers AI's ethical and societal ramifications, including data privacy, algorithmic biases, and the psychological effects of AI-driven social media platforms. It quantifies AI's impact on job markets and economic behaviors, revealing labor force prospects and difficulties. This study also examines how AI improves healthcare, education, and convenience. This research seeks to understand how AI is changing human behavior and well-being through rigorous statistical analysis and data-driven investigation. The findings can help students, professionals, and society safely and ethically navigate AI technologies. This study emphasizes the need for a balanced strategy to exploit AI's benefits while minimizing its potential harm to individuals and societies. The main aim of the research is to identify and analyse the variables related to artificial intelligence which impact human behaviour and well-being.
Chapter
After a particularly difficult period of the pandemic, which disrupted the sense of security, both economic and social, as well as psychological one, Poles have been exposed to yet another difficult test. The war in Ukraine and its direct social, political and cultural impacts on Poland broke the fragile post-pandemic set-up and deprived people of prospects to return to a settled and predictable reality. At the same time, it was the reason some people have focused on Internet-driven conspiracy theories—conspiracy theories, defined as explanatory beliefs about a group of actors that collude in secret to reach malevolent goals (Bale in Patterns of Prejudice 41(1), 45–60, 2007; van Prooijen & Douglas in European Journal of Social Psychology 48(7), 897–908, 2018). During crises, people are prone to support conspiracy theories (Bavel et al. in Nature Human Behaviour 4, 460–471, 2020; van Prooijen & Douglas in Memory Studies 10(3), 323–333, 2017; van Prooijen & Douglas in European Journal of Social Psychology 48(7), 897–908, 2018). They seem to be efficient in situations that we find hard to explain and offer plausible solutions (Oleksy et al. in Personality and Individual Differences 168, 110289, 2021). The theories provide illusive perspectives and false knowledge to silence our uncertainty. The article attempts to analyse social and psychological mobilization mechanisms used in the conspiracy-type content on Twitter during the first phase of the war in Ukraine. Research proves that disinformation on Twitter can spread much more widely than true information, depending on the depth of the tweet cascades, the users involved in sharing and the time of sharing. The study is both quantitative (number of accounts, followers, comments, etc.) and qualitative in its attempt to point out mobilization mechanisms used to attract supporters of conspiracy theories.
Chapter
The recent controversy over ‘fake news' reminds us of one of the main problems on the web today: the utilization of social media and other outlets to distribute false and misleading content. The impact of this problem is very significant. This article discusses the issue of fake content on the web. First, it defines the problem and shows that, in many cases, it is surprisingly hard to establish when a piece of news is untrue. It distinguishes the issue of fake content from issues of hate/offensive speech (while there is a relation, the issues involved are a bit different). It then overviews proposed solutions to the problem of fake content detection, both algorithmic and human. On the algorithmic side, it focuses on work on classifiers. The chapter shows that most algorithmic approaches have significant issues, which has led to reliance on the human approach in spite of its known limitations (subjectivity, difficulty to scale). Finally, it closes with a discussion of potential future work.
Article
From a socio-theoretical and media-theoretical perspective, this article analyses exemplary practices and structural characteristics of contemporary digital political campaigning to illustrate a transformation of the public sphere through the platform economy. The article first examines Cambridge Analytica and reconstructs its operational procedure, which, far from involving exceptionally new digital campaign practices, turns out to be quite standard. It then evaluates the role of Facebook as an enabling ‘affective infrastructure’, technologically orchestrating processes of political opinion-formation. Of special concern are various tactics of ‘feedback propaganda’ and algorithmic-based user engagement that reflect, at a more theoretical level, the merging of surveillance-capitalist commercialization with a cybernetic logic of communication. The article proposes that this techno-economic dynamic reflects a continuation of the structural transformation of the public sphere. What Jürgen Habermas had analysed in terms of an economic fabrication of the public sphere in the 1960s is now advancing in a more radical form, and on a more programmatic basis, through the algorithmic architecture of social media. As the authors argue, this process will eventually lead to a new form of ‘infrastructural power’.
Article
Social media, in general, and Facebook in particular, have been clearly identified as important platforms for the dissemination of mis- and disinformation and related problematic content. However, the patterns and processes of such dissemination are still not sufficiently understood. We detail a novel computational methodology that focusses on the identification of high-profile vectors of “fake news” and other problematic information in public Facebook spaces. The method enables examination of networks of content sharing that emerge between public pages and groups, and external sources, and the study of longitudinal dynamics of these networks as interests and allegiances shift and new developments (such as the COVID-19 pandemic or the US presidential elections) drive the emergence or decline of dominant themes. Through a case study of content captured between 2016 and 2021, we demonstrate how this methodology allows the development of a new and more comprehensive picture of the overall impact of “fake news,” in all its forms, on contemporary societies.
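A minimal sketch of the network-construction step such a methodology implies, linking public pages or groups to the external domains they share (networkx on made-up records; the field names are assumptions):

```python
import networkx as nx
from urllib.parse import urlparse

# Hypothetical sharing records: (page_or_group, shared_url).
shares = [
    ("Page A", "https://dubious-news.example/story1"),
    ("Group B", "https://dubious-news.example/story2"),
    ("Page A", "https://legit-paper.example/report"),
]

G = nx.Graph()
for page, url in shares:
    domain = urlparse(url).netloc
    G.add_node(page, kind="page")
    G.add_node(domain, kind="domain")
    # Edge weights count how often a page links to a domain.
    w = G.get_edge_data(page, domain, {"weight": 0})["weight"]
    G.add_edge(page, domain, weight=w + 1)

# Domains ranked by how many distinct pages or groups share them.
domains = [n for n, d in G.nodes(data=True) if d["kind"] == "domain"]
print(sorted(domains, key=G.degree, reverse=True))
```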
Article
Healthy news consumption requires limited exposure to unreliable content and ideological diversity in the sources consumed. There are two challenges to this normative expectation: the prevalence of unreliable content online; and the prominence of misinformation within individual news diets. Here, we assess these challenges using an observational panel tracking the browsing behavior of N ≈ 140,000 individuals in the United States for 12 months (January–December 2018). Our results show that panelists who are exposed to misinformation consume more reliable news and from a more ideologically diverse range of sources. In other words, exposure to unreliable content is higher among the better informed. This association persists after we control for partisan leaning and consider inter- and intra-person variation. These findings highlight the tension between the positive and negative consequences of increased exposure to news content online.
Article
Recent studies have documented the type of content that is most likely to spread widely, or go “viral,” on social media, yet little is known about people’s perceptions of what goes viral or what should go viral. This is critical to understand because there is widespread debate about how to improve or regulate social media algorithms. We recruited a sample of participants that is nationally representative of the U.S. population (according to age, gender, and race/ethnicity) and surveyed them about their perceptions of social media virality (n = 511). In line with prior research, people believe that divisive content, moral outrage, negative content, high-arousal content, and misinformation are all likely to go viral online. However, they reported that this type of content should not go viral on social media. Instead, people reported that many forms of positive content—such as accurate content, nuanced content, and educational content—are not likely to go viral even though they think this content should go viral. These perceptions were shared among most participants and were only weakly related to political orientation, social media usage, and demographic variables. In sum, there is broad consensus around the type of content people think social media platforms should and should not amplify, which can help inform solutions for improving social media.
Chapter
Digital libraries have focused on change to images from the perspectives of prevention and reversal. Since change is a required component of scholarship, we seek to add the modeling of change in order to support its characterization. In this paper we discuss change to images in traditional media and propose a formal model of that change. The subject calls for a kaleidoscopic approach: tracking changes in images is an interesting exercise in storytelling, both when one looks at images being deliberately changed with a purpose and when one tracks past changes.
Article
Abstract: Fabricated news ("asparagas"), regarded as one of the main problems of sports media, is a term used in the journalism literature for news with no relation to reality. It has been argued that sports journalism is more prone to stepping outside the profession's normative and strict standards and to ethical violations. It is also thought that the problem of fabricated news in sports journalism has grown with the digital transformation of journalism. Although there is intensive research on journalism, only a limited number of studies address this problem. This study therefore aimed to examine the causes of fabricated news through the views on social media of sports reporters who actively use social media, and Twitter in particular. In this context, data were collected using semi-structured interviews with a sample of sports reporters with large followings on Twitter, and the data obtained were analyzed using professional data analysis software. According to the findings, the participants, reflecting professional journalism standards, emphasized accuracy, reliability, and impartiality, and stated that social media and Twitter enable journalists to promote themselves, making Twitter a suitable medium for journalism. On the other hand, the participants' negative views of social media and Twitter outnumbered their positive ones. The participants pointed to problems such as the acceleration of the news flow, everyone becoming a reporter, and the rise of populism and fraud. According to the participants, the most important reason for the spread of fabricated news is the concern for views and engagement; moreover, there is also substantial demand for fabricated news among readers. Keywords: Digital Journalism, Sports Journalism, Social Media, Twitter, Fabricated News
Conference Paper
The dissemination of fake news intensified during the COVID-19 pandemic. Thus, in this paper we propose a process-based model grounded in the diffusion of innovations theory to investigate how fake news was shared via social media. Accordingly, data were categorized according to the taxonomy of a process, namely: input, output, players, and activities. We found that individual decision-making traits, the perceived features of fake news, and communication channel attributes determine a personal attitude toward a received piece of fake news. That attitude then leads to the decision of whether or not to believe the fake news, while confirmation leads to its spread. In this way, social media, mediated by the power of algorithms and echo chambers, increase the chance of spreading misinformation.
Article
This article analyses various training methods and neural network tools for fake news detection. Approaches to fake news detection based on textual, visual and mixed data are considered, as well as the use of different types of neural networks, such as recurrent neural networks, convolutional neural networks, deep neural networks, generative adversarial networks and others. Also considered are supervised and unsupervised learning methods such as autoencoding neural networks and deep variational autoencoding neural networks. Based on the analysed studies, attention is drawn to the problems associated with limitations in the volume and quality of data, as well as the lack of efficient tools for detecting complex types of fakes. The author analyses neural network-based applications and tools and draws conclusions about their effectiveness and suitability for different types of data and fake detection tasks. The study found that machine and deep learning models, as well as adversarial learning methods and special tools for detecting fake media, are effective in detecting fakes. However, the effectiveness and accuracy of these methods and tools can be affected by factors such as data quality, the methods used for training and evaluation, and the complexity of the fake media being detected. Based on the analysis of training methods and neural network characteristics, the advantages and disadvantages of different fake news detection methods are identified. Ongoing research and development in this area is crucial to improve the accuracy and reliability of these methods and tools for fake news detection.
Article
Online misinformation promotes distrust in science, undermines public health, and may drive civil unrest. During the coronavirus disease 2019 pandemic, Facebook—the world’s largest social media company—began to remove vaccine misinformation as a matter of policy. We evaluated the efficacy of these policies using a comparative interrupted time-series design. We found that Facebook removed some antivaccine content, but we did not observe decreases in overall engagement with antivaccine content. Provaccine content was also removed, and antivaccine content became more misinformative, more politically polarized, and more likely to be seen in users’ newsfeeds. We explain these findings as a consequence of Facebook’s system architecture, which provides substantial flexibility to motivated users who wish to disseminate misinformation through multiple channels. Facebook’s architecture may therefore afford antivaccine content producers several means to circumvent the intent of misinformation removal policies.
Article
Full-text available
This article conducts a sociological exploration of disinformation through the analysis of Islamophobic hoaxes in Spain during the first months of the COVID-19 outbreak. Our hypothesis places the focus of the research on the sociological dynamics arising from the interaction between networked connections, the asymmetric information environment, and the emergence of the radical right. After explaining the chosen methodology (qualitative in nature) and in light of the stated hypothesis, the article identifies and presents three main narratives extracted from the analysis of 17 Islamophobic fake news stories. These corroborate the results of previous studies and make it possible to analyze both the Islamophobic content of the chosen sample of hoaxes and the sociological dynamics that explain the genesis and spread of the disinformation phenomenon in the selected case study.
Article
Full-text available
The World Economic Forum listed massive digital misinformation as one of the main threats to our society. The spreading of unsubstantiated rumors may have serious consequences on public opinion, as in the case of rumors about Ebola causing disruption to health-care workers. In this work we target Facebook to characterize the information consumption patterns of 1.2 M Italian users with respect to verified (science news) and unverified (conspiracy news) contents. Through a thorough quantitative analysis we provide important insights about the anatomy of the system across which misinformation might spread. In particular, we show that users’ engagement on verified (or unverified) content correlates with the number of friends having similar consumption patterns (homophily). Finally, we measure how this social system responded to the injection of 4,709 items of false information. We find that frequent (and selective) exposure to a specific kind of content (polarization) is a good proxy for the detection of homophilous clusters where certain kinds of rumors are more likely to spread.
Conference Paper
Full-text available
While most online social media accounts are controlled by humans, these platforms also host automated agents called social bots or sybil accounts. Recent literature reported on cases of social bots imitating humans to manipulate discussions, alter the popularity of users, pollute content and spread misinformation, and even perform terrorist propaganda and recruitment actions. Here we present BotOrNot, a publicly-available service that leverages more than one thousand features to evaluate the extent to which a Twitter account exhibits similarity to the known characteristics of social bots. Since its release in May 2014, BotOrNot has served over one million requests via our website and APIs.
Article
Full-text available
Spam in online social networks (OSNs) is a systemic problem that imposes a threat to these services in terms of undermining their value to advertisers and potential investors, as well as negatively affecting users’ engagement. As spammers continuously keep creating newer accounts and evasive techniques upon being caught, a deeper understanding of their spamming strategies is vital to the design of future social media defense mechanisms. In this work, we present a unique analysis of spam accounts in OSNs viewed through the lens of their behavioral characteristics. Our analysis includes over 100 million messages collected from Twitter over the course of 1 month. We show that there exist two behaviorally distinct categories of spammers and that they employ different spamming strategies. Then, we illustrate how users in these two categories demonstrate different individual properties as well as social interaction patterns. Finally, we analyze the detectability of spam accounts with respect to three categories of features, namely content attributes, social interactions, and profile properties.
Article
Full-text available
Significance The wide availability of user-provided content in online social media facilitates the aggregation of people around common interests, worldviews, and narratives. However, the World Wide Web is a fruitful environment for the massive diffusion of unverified rumors. In this work, using a massive quantitative analysis of Facebook, we show that information related to distinct narratives––conspiracy theories and scientific news––generates homogeneous and polarized communities (i.e., echo chambers) having similar information consumption patterns. Then, we derive a data-driven percolation model of rumor spreading that demonstrates that homogeneity and polarization are the main determinants for predicting cascades’ size.
Conference Paper
Full-text available
The goal of this work is to introduce a simple modeling framework to study the diffusion of hoaxes and, in particular, how the availability of debunking information may contain their diffusion. As traditionally done in the mathematical modeling of information diffusion processes, we regard hoaxes as viruses: users can become infected if they are exposed to them, and turn into spreaders as a consequence. Upon verification, users can also turn into non-believers and spread the same attitude with a mechanism analogous to that of the hoax-spreaders. Both believers and non-believers, as time passes, can return to a susceptible state. Our model is characterized by four parameters: the spreading rate, the gullibility, the probability of verifying a hoax, and the probability of forgetting one's current belief. Simulations on homogeneous, heterogeneous, and real networks for a wide range of parameter values reveal a threshold for the fact-checking probability that guarantees the complete removal of the hoax from the network. Via a mean-field approximation, we establish that the threshold value does not depend on the spreading rate but only on the gullibility and the forgetting probability. Our approach allows us to quantitatively gauge the minimal reaction necessary to eradicate a hoax.
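A minimal agent-based sketch of this believer/fact-checker dynamic is given below. The parameter names (spreading rate beta, gullibility alpha, verification and forgetting probabilities) follow the abstract, but the exact transition rules are a simplified reading of the model, not the authors' precise specification.

```python
import random
import networkx as nx

# States: 'S' susceptible, 'B' believer (spreads the hoax),
# 'F' fact-checker (spreads the debunking). Simplified reading of the model.
def simulate(G, beta=0.5, alpha=0.8, p_verify=0.1, p_forget=0.1,
             steps=200, seeds=5):
    state = {v: 'S' for v in G}
    for v in random.sample(list(G), seeds):
        state[v] = 'B'
    for _ in range(steps):
        new = dict(state)
        for v in G:
            if state[v] == 'S':
                nb = sum(state[u] == 'B' for u in G[v])
                nf = sum(state[u] == 'F' for u in G[v])
                if random.random() < beta * (nb + nf) / max(len(G[v]), 1):
                    # gullibility alpha biases the exposed user toward believing
                    denom = nb * (1 + alpha) + nf * (1 - alpha)
                    if denom > 0:
                        new[v] = 'B' if random.random() < nb * (1 + alpha) / denom else 'F'
            elif state[v] == 'B':
                if random.random() < p_verify:
                    new[v] = 'F'          # believer checks the facts
                elif random.random() < p_forget:
                    new[v] = 'S'          # belief decays
            elif state[v] == 'F' and random.random() < p_forget:
                new[v] = 'S'
        state = new
    return sum(s == 'B' for s in state.values())

G = nx.watts_strogatz_graph(1000, 8, 0.1)
print("believers left:", simulate(G))
```

Sweeping p_verify upward in such a simulation should exhibit the threshold behavior the abstract describes, with the believer population dying out above the critical fact-checking probability.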
Article
Full-text available
Traditional fact checking by expert journalists cannot keep up with the enormous volume of information that is now generated online. Computational fact checking may significantly enhance our ability to evaluate the veracity of dubious information. Here we show that the complexities of human fact checking can be approximated quite well by finding the shortest path between concept nodes under properly defined semantic proximity metrics on knowledge graphs. Framed as a network problem, this approach is feasible with efficient computational techniques. We evaluate this approach by examining tens of thousands of claims related to history, entertainment, geography, and biographical information using a public knowledge graph extracted from Wikipedia. Statements independently known to be true consistently receive higher support via our method than do false ones. These findings represent a significant step toward scalable computational fact-checking methods that may one day mitigate the spread of harmful misinformation.
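The core idea lends itself to a compact sketch: treat the knowledge graph as a network, penalize paths that pass through very general (high-degree) entities, and read the resulting proximity as support for a claim. The scoring function below is one plausible choice, not necessarily the authors' exact semantic proximity metric.

```python
import math
import networkx as nx

# Hedged sketch of shortest-path fact checking on a knowledge graph.
def truth_score(G, subject, obj):
    # Edge cost grows with the degree of the node being entered, so paths
    # through generic hub entities contribute less support.
    def cost(u, v, d):
        return math.log(max(G.degree(v), 2))
    try:
        length = nx.shortest_path_length(G, subject, obj, weight=cost)
    except nx.NetworkXNoPath:
        return 0.0
    return 1.0 / (1.0 + length)   # closer concepts -> higher support

# Toy knowledge graph: nodes are entities, edges are Wikipedia-style links.
G = nx.Graph()
G.add_edges_from([("Barack Obama", "Honolulu"), ("Honolulu", "Hawaii"),
                  ("Barack Obama", "United States"), ("Hawaii", "United States")])
print(truth_score(G, "Barack Obama", "Hawaii"))
```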
Article
Full-text available
The Boston Marathon bombing story unfolded on every possible carrier of information available in the spring of 2013, including Twitter. As information spread, it was filled with rumors (unsubstantiated information), and many of these rumors contained misinformation. Earlier studies have suggested that crowdsourced information flows can correct misinformation, and our research investigates this proposition. This exploratory research examines three rumors, later demonstrated to be false, that circulated on Twitter in the aftermath of the bombings. Our findings suggest that corrections to the misinformation emerge but are muted compared with the propagation of the misinformation. The similarities and differences we observe in the patterns of the misinformation and corrections contained within the stream over the days that followed the attacks suggest directions for possible research strategies to automatically detect misinformation.
Article
Full-text available
The large availability of user-provided content on online social media facilitates people's aggregation around common interests, worldviews and narratives. However, in spite of the enthusiastic rhetoric about the so-called "wisdom of crowds", unsubstantiated rumors -- as alternative explanations to mainstream versions of complex phenomena -- find on the Web a natural medium for their dissemination. In this work we study, on a sample of 1.2 million individuals, how information related to very distinct narratives -- i.e. mainstream scientific and alternative news -- is consumed on Facebook. Through a thorough quantitative analysis, we show that distinct communities with similar information consumption patterns emerge around distinctive narratives. Moreover, consumers of alternative news (mainly conspiracy theories) turn out to be more focused on their contents, while scientific news consumers are more prone to comment on alternative news. We conclude our analysis by testing the response of this social system to 4,709 "troll" posts -- i.e. parodic imitations of alternative and conspiracy theories. We find that, despite the false and satirical vein of these posts, the usual consumers of conspiracy news are the most prone to interact with them.
Article
Full-text available
The Turing test asked whether one could recognize the behavior of a human from that of a computer algorithm. Today this question has suddenly become very relevant in the context of social media, where text constraints limit the expressive power of humans, and real incentives abound to develop human-mimicking software agents called social bots. These elusive entities wildly populate social media ecosystems, often going unnoticed among the population of real people. Bots can be benign or harmful, aiming at persuading, smearing, or deceiving. Here we discuss the characteristics of modern, sophisticated social bots, and how their presence can endanger online ecosystems and our society. We then discuss current efforts aimed at detection of social bots in Twitter. Characteristics related to content, network, sentiment, and temporal patterns of activity are imitated by bots but at the same time can help discriminate synthetic behaviors from human ones, yielding signatures of engineered social tampering.
Conference Paper
Full-text available
In today's world, online social media play a vital role during real-world events, especially crisis events. There are both positive and negative effects of social media coverage of events: it can be used by authorities for effective disaster management or by malicious entities to spread rumors and fake news. The aim of this paper is to highlight the role of Twitter during Hurricane Sandy (2012) in spreading fake images about the disaster. We identified 10,350 unique tweets containing fake images that were circulated on Twitter during Hurricane Sandy. We performed a characterization analysis to understand the temporal, social reputation, and influence patterns for the spread of fake images. Eighty-six percent of the tweets spreading the fake images were retweets; hence, very few were original tweets. Our results showed that the top thirty users out of 10,215 (0.3%) accounted for 90% of the retweets of fake images; also, network links such as Twitter follower relationships contributed very little (only 11%) to the spread of these fake photo URLs. Next, we used classification models to distinguish fake images from real images of Hurricane Sandy. The best results were obtained with a decision tree classifier, which achieved 97% accuracy in predicting fake images from real ones. Tweet-based features were very effective in distinguishing tweets with fake images from those with real ones, while the performance of user-based features was very poor. Our results showed that automated techniques can be used to identify real images from fake images posted on Twitter.
Conference Paper
Full-text available
Characterizing information diffusion on social platforms like Twitter enables us to understand the properties of the underlying media and to model communication patterns. As Twitter gains in popularity, it has also become a venue to broadcast rumors and misinformation. We use epidemiological models to characterize information cascades on Twitter resulting from both news and rumors. Specifically, we use the SEIZ enhanced epidemic model, which explicitly recognizes skeptics, to characterize eight events across the world spanning a range of event types. We demonstrate that our approach is accurate at capturing diffusion in these events. Our approach can be fruitfully combined with other strategies that use content modeling and graph-theoretic features to detect (and possibly disrupt) rumors.
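For concreteness, a hedged sketch of the SEIZ compartment model (Susceptible, Exposed, Infected, skeptic Z) follows, integrating a commonly used form of its rate equations with SciPy; the parameter values are illustrative only and are not fitted to any of the eight events studied in the paper.

```python
import numpy as np
from scipy.integrate import odeint

# A common SEIZ formulation: S contacts I at rate beta and Z at rate b;
# exposed users E adopt with incubation rate eps or via further contact (rho).
def seiz(y, t, N, beta, b, rho, eps, p, l):
    S, E, I, Z = y
    dS = -beta * S * I / N - b * S * Z / N
    dE = (1 - p) * beta * S * I / N + (1 - l) * b * S * Z / N \
         - rho * E * I / N - eps * E
    dI = p * beta * S * I / N + rho * E * I / N + eps * E
    dZ = l * b * S * Z / N
    return dS, dE, dI, dZ

N = 10_000
y0 = (N - 10, 0, 10, 0)                 # a few initial spreaders
t = np.linspace(0, 30, 300)
sol = odeint(seiz, y0, t, args=(N, 0.8, 0.4, 0.3, 0.1, 0.2, 0.3))
print("final adopters:", sol[-1, 2])
```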
Conference Paper
Full-text available
Twitter is useful in a disaster situation for communication, announcements, requests for rescue, and so on. On the other hand, it causes a negative by-product: the spread of rumors. This paper describes how rumors spread after an earthquake disaster and discusses how we can deal with them. We first investigated actual instances of rumors after the disaster and then attempted to identify the characteristics of those rumors. Based on this investigation, we developed a system that detects rumor candidates on Twitter, and we evaluated it. The experimental results show that the proposed algorithm can find rumors with acceptable accuracy.
Article
Full-text available
Detecting emotions in microblogs and social media posts has applications for industry, health, and security. Statistical, supervised automatic methods for emotion detection rely on text that is labeled for emotions, but such data are rare and available for only a handful of basic emotions. In this article, we show that emotion-word hashtags are good manual labels of emotions in tweets. We also propose a method to generate a large lexicon of word–emotion associations from this emotion-labeled tweet corpus. This is the first lexicon with real-valued word–emotion association scores. We begin with experiments for six basic emotions and show that the hashtag annotations are consistent and match with the annotations of trained judges. We also show how the extracted tweet corpus and word–emotion associations can be used to improve emotion classification accuracy in a different nontweet domain.Eminent psychologist Robert Plutchik had proposed that emotions have a relationship with personality traits. However, empirical experiments to establish this relationship have been stymied by the lack of comprehensive emotion resources. Because personality may be associated with any of the hundreds of emotions and because our hashtag approach scales easily to a large number of emotions, we extend our corpus by collecting tweets with hashtags pertaining to 585 fine emotions. Then, for the first time, we present experiments to show that fine emotion categories such as those of excitement, guilt, yearning, and admiration are useful in automatically detecting personality from text. Stream-of-consciousness essays and collections of Facebook posts marked with personality traits of the author are used as test sets.
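A toy version of the hashtag-labeling idea can be sketched in a few lines: treat an emotion-word hashtag as the label for its tweet and aggregate word counts per emotion. The relative-frequency score used here is a deliberately simple stand-in; the published lexicon uses its own real-valued association scores.

```python
from collections import Counter, defaultdict

# Build a tiny word-emotion lexicon from hashtag-labeled tweets.
# tweets: list of (tokens, emotion) pairs, where the emotion label comes
# from a hashtag such as #anger or #joy stripped from the tweet text.
def build_lexicon(tweets):
    word_emotion = defaultdict(Counter)
    for tokens, emotion in tweets:
        for w in set(tokens):
            word_emotion[w][emotion] += 1
    lexicon = {}
    for w, counts in word_emotion.items():
        total = sum(counts.values())
        # score: share of the word's labeled occurrences under each emotion
        lexicon[w] = {e: c / total for e, c in counts.items()}
    return lexicon

tweets = [(["traffic", "late", "again"], "anger"),
          (["sunny", "beach", "weekend"], "joy"),
          (["late", "bus", "ugh"], "anger")]
print(build_lexicon(tweets)["late"])   # {'anger': 1.0}
```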
Article
Full-text available
The announcement of the discovery of a Higgs boson-like particle at CERN will be remembered as one of the milestones of the scientific endeavor of the 21st century. In this paper we present a study of information spreading processes on Twitter before, during and after the announcement of the discovery of a new particle with the features of the elusive Higgs boson on 4th July 2012. We report evidence for non-trivial spatio-temporal patterns in user activities at individual and global level, such as tweeting, re-tweeting and replying to existing tweets. We provide a possible explanation for the observed time-varying dynamics of user activities during the spreading of this scientific "rumor". We model the information spreading in the corresponding network of individuals who posted a tweet related to the Higgs boson discovery. Finally, we show that we are able to reproduce the global behavior of about 500,000 individuals with remarkable accuracy.
Article
Full-text available
Even though considerable attention has been given to the polarity of words (positive and negative) and the creation of large polarity lexicons, research in emotion analysis has had to rely on limited and small emotion lexicons. In this paper we show how the combined strength and wisdom of the crowds can be used to generate a large, high-quality, word-emotion and word-polarity association lexicon quickly and inexpensively. We enumerate the challenges in emotion annotation in a crowdsourcing scenario and propose solutions to address them. Most notably, in addition to questions about emotions associated with terms, we show how the inclusion of a word choice question can discourage malicious data entry, help identify instances where the annotator may not be familiar with the target term (allowing us to reject such annotations), and help obtain annotations at sense level (rather than at word level). We conducted experiments on how to formulate the emotion-annotation questions, and show that asking if a term is associated with an emotion leads to markedly higher inter-annotator agreement than that obtained by asking if a term evokes an emotion.
Conference Paper
Full-text available
Models of networked diffusion that are motivated by analogy with the spread of infectious disease have been applied to a wide range of social and economic adoption processes, including those related to new products, ideas, norms and behaviors. However, it is unknown how accurately these models account for the empirical structure of diffusion over networks. Here we describe the diffusion patterns arising from seven online domains, ranging from communications platforms to networked games to microblogging services, each involving distinct types of content and modes of sharing. We find strikingly similar patterns across all domains. In particular, the vast majority of cascades are small, and are described by a handful of simple tree structures that terminate within one degree of an initial adopting "seed." In addition we find that structures other than these account for only a tiny fraction of total adoptions; that is, adoptions resulting from chains of referrals are extremely rare. Finally, even for the largest cascades that we observe, we find that the bulk of adoptions often takes place within one degree of a few dominant individuals. Together, these observations suggest new directions for modeling of online adoption processes.
Conference Paper
Full-text available
In this article we explore the behavior of Twitter users in an emergency situation. In particular, we analyze the activity related to the 2010 earthquake in Chile and characterize Twitter in the hours and days following this disaster. Furthermore, we perform a preliminary study of certain social phenomena, such as the dissemination of false rumors and confirmed news. We analyze how this information propagated through the Twitter network, with the purpose of assessing the reliability of Twitter as an information source under extreme circumstances. Our analysis shows that the propagation of tweets that correspond to rumors differs from that of tweets that spread news, because rumors tend to be questioned more than news by the Twitter community. This result shows that it is possible to detect rumors by using aggregate analysis of tweets.
Conference Paper
Full-text available
Even though considerable attention has been given to semantic orientation of words and the creation of large polarity lexicons, research in emotion analysis has had to rely on limited and small emotion lexicons. In this paper, we show how we create a high-quality, moderate-sized emotion lexicon using Mechanical Turk. In addition to questions about emotions evoked by terms, we show how the inclusion of a word choice question can discourage malicious data entry, help identify instances where the annotator may not be familiar with the target term (allowing us to reject such annotations), and help obtain annotations at sense level (rather than at word level). We perform an extensive analysis of the annotations to better understand the distribution of emotions evoked by terms of different parts of speech. We identify which emotions tend to be evoked simultaneously by the same term and show that certain emotions indeed go hand in hand.
Conference Paper
Full-text available
Due to its rapid speed of information spread, wide user base, and extreme mobility, Twitter is drawing attention as a potential emergency-reporting tool under extreme events. At the same time, however, Twitter is sometimes dismissed as a citizen-based, non-professional social medium that propagates misinformation, rumors, and, in extreme cases, propaganda. This study explores the working dynamics of the rumor mill by analyzing Twitter data from the 2010 Haiti earthquake. For this analysis, two key variables, anxiety and informational uncertainty, are derived from rumor theory, and their interactive dynamics are measured by both quantitative and qualitative methods. Our research finds that information from credible sources helps suppress the level of anxiety in the Twitter community, which leads to rumor control and higher information quality.
Conference Paper
Full-text available
Twitter, as a new form of social media, can potentially contain much useful information, but content analysis on Twitter has not been well studied. In particular, it is not clear whether, as an information source, Twitter can simply be regarded as a faster news feed that covers mostly the same information as traditional news media. In this paper we empirically compare the content of Twitter with that of a traditional news medium, the New York Times, using unsupervised topic modeling. We use a Twitter-LDA model to discover topics from a representative sample of the entire Twitter. We then use text mining techniques to compare these Twitter topics with topics from the New York Times, taking into consideration topic categories and types. We also study the relation between the proportions of opinionated tweets and retweets and topic categories and types. Our comparisons show interesting and useful findings for downstream IR or DM applications.
Conference Paper
Full-text available
We analyze the information credibility of news propagated through Twitter, a popular microblogging service. Previous research has shown that most of the messages posted on Twitter are truthful, but the service is also used to spread misinformation and false rumors, often unintentionally. In this paper we focus on automatic methods for assessing the credibility of a given set of tweets. Specifically, we analyze microblog postings related to "trending" topics and classify them as credible or not credible, based on features extracted from them. We use features from users' posting and re-posting ("re-tweeting") behavior, from the text of the posts, and from citations to external sources. We evaluate our methods using a significant number of human assessments of the credibility of items in a recent sample of Twitter postings. Our results show that there are measurable differences in the way messages propagate that can be used to classify them automatically as credible or not credible, with precision and recall in the range of 70% to 80%.
Article
The spread of malicious or accidental misinformation in social media, especially in time-sensitive situations such as real-world emergencies, can have harmful effects on individuals and society. In this work, we developed models for automated verification of rumors (unverified information) that propagate through Twitter. To predict the veracity of rumors, we identified salient features of rumors by examining three aspects of information spread: the linguistic style used to express rumors, the characteristics of the people involved in propagating information, and network propagation dynamics. The veracity of a rumor (a collection of tweets) is predicted from a time series of these features using Hidden Markov Models. The verification algorithm was trained and tested on 209 rumors representing 938,806 tweets collected from real-world events, including the 2013 Boston Marathon bombings, the 2014 Ferguson unrest, and the 2014 Ebola epidemic, as well as many other rumors about various real-world events reported on popular websites that document public rumors. The algorithm was able to correctly predict the veracity of 75% of the rumors faster than any other public source, including journalists and law enforcement officials. The ability to track rumors and predict their outcomes may have practical applications for news consumers, financial markets, journalists, and emergency services, and more generally may help minimize the impact of false information on Twitter.
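A hedged sketch of the HMM-based verification step follows: one model is trained on feature time series from true rumors and one on false rumors, and a new rumor is labeled by whichever model assigns its feature series a higher likelihood. Feature extraction (linguistic, user, and propagation features per time bin) is assumed to happen upstream; the synthetic arrays below are purely illustrative.

```python
import numpy as np
from hmmlearn.hmm import GaussianHMM

# Train one HMM per class on variable-length feature time series.
def train(series_list, n_states=4):
    X = np.vstack(series_list)
    lengths = [len(s) for s in series_list]
    return GaussianHMM(n_components=n_states, n_iter=100).fit(X, lengths)

rng = np.random.default_rng(0)
# Each "rumor" is a (time bins x features) matrix; values are synthetic.
true_rumors = [rng.normal(0.0, 1.0, size=(50, 3)) for _ in range(20)]
false_rumors = [rng.normal(0.8, 1.2, size=(50, 3)) for _ in range(20)]

hmm_true, hmm_false = train(true_rumors), train(false_rumors)
new_rumor = rng.normal(0.8, 1.2, size=(50, 3))
verdict = "true" if hmm_true.score(new_rumor) > hmm_false.score(new_rumor) else "false"
print("predicted veracity:", verdict)
```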
Conference Paper
Cascades of information-sharing are a primary mechanism by which content reaches its audience on social media, and an active line of research has studied how such cascades, which form as content is reshared from person to person, develop and subside. In this paper, we perform a large-scale analysis of cascades on Facebook over significantly longer time scales, and find that a more complex picture emerges, in which many large cascades recur, exhibiting multiple bursts of popularity with periods of quiescence in between. We characterize recurrence by measuring the time elapsed between bursts, their overlap and proximity in the social network, and the diversity in the demographics of individuals participating in each peak. We discover that content virality, as revealed by its initial popularity, is a main driver of recurrence, with the availability of multiple copies of that content helping to spark new bursts. Still, beyond a certain popularity of content, the rate of recurrence drops as cascades start exhausting the population of interested individuals. We reproduce these observed patterns in a simple model of content recurrence simulated on a real social network. Using only characteristics of a cascade's initial burst, we demonstrate strong performance in predicting whether it will recur in the future.
Conference Paper
We present Tweet2Vec, a novel method for generating general-purpose vector representations of tweets. The model learns tweet embeddings using a character-level CNN-LSTM encoder-decoder. We trained our model on 3 million randomly selected English-language tweets. The model was evaluated using two methods, tweet semantic similarity and tweet sentiment categorization, outperforming the previous state of the art in both tasks. The evaluations demonstrate the power of the tweet embeddings generated by our model for various tweet categorization tasks. The vector representations generated by our model are generic, and hence can be applied to a variety of tasks. Though the model presented in this paper is trained on English-language tweets, the method presented can be used to learn tweet embeddings for different languages.
Conference Paper
Many previous techniques identify trending topics in social media, even topics that are not pre-defined. We present a technique to identify trending rumors, which we define as topics that include disputed factual claims. Putting aside any attempt to assess whether the rumors are true or false, it is valuable to identify trending rumors as early as possible. It is extremely difficult to accurately classify whether every individual post is or is not making a disputed factual claim. We are able to identify trending rumors by recasting the problem as finding entire clusters of posts whose topic is a disputed factual claim. The key insight is that when there is a rumor, even though most posts do not raise questions about it, there may be a few that do. If we can find signature text phrases that are used by a few people to express skepticism about factual claims and are rarely used to express anything else, we can use those as detectors for rumor clusters. Indeed, we have found a few phrases that seem to be used exactly that way, including: "Is this true?", "Really?", and "What?". Relatively few posts related to any particular rumor use any of these enquiry phrases, but lots of rumor diffusion processes have some posts that do and have them quite early in the diffusion. We have developed a technique based on searching for the enquiry phrases, clustering similar posts together, and then collecting related posts that do not contain these simple phrases. We then rank the clusters by their likelihood of really containing a disputed factual claim. The detector, which searches for the very rare but very informative phrases, combined with clustering and a classifier on the clusters, yields surprisingly good performance. On a typical day of Twitter, about a third of the top 50 clusters were judged to be rumors, a high enough precision that human analysts might be willing to sift through them.
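The enquiry-phrase idea can be sketched very simply: flag posts matching a handful of skeptical phrases, group posts by topic, and rank topics by the share of skeptical posts. The grouping below assumes topic labels are supplied upstream; the paper's actual clustering and ranking pipeline is more elaborate.

```python
import re
from collections import defaultdict

# Phrases that signal skepticism about a factual claim, per the abstract.
ENQUIRY = re.compile(r"is this true\?|really\?|what\?|unconfirmed", re.I)

def trending_rumor_candidates(posts):
    # posts: list of (topic, text); topic assignment happens upstream
    clusters = defaultdict(lambda: {"posts": 0, "enquiries": 0})
    for topic, text in posts:
        clusters[topic]["posts"] += 1
        if ENQUIRY.search(text):
            clusters[topic]["enquiries"] += 1
    # rank topics by the share of posts expressing skepticism
    return sorted(clusters.items(),
                  key=lambda kv: kv[1]["enquiries"] / kv[1]["posts"],
                  reverse=True)

posts = [("celebrity", "Is this true? I heard he was arrested"),
         ("celebrity", "he was arrested last night"),
         ("weather", "sunny all week")]
print(trending_rumor_candidates(posts)[0])
```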
Article
Twitter has become one of the main sources of news for many people. As real-world events and emergencies unfold, Twitter is abuzz with hundreds of thousands of stories about the events. Some of these stories are harmless, while others could potentially be life-saving or sources of malicious rumors. Thus, it is critically important to be able to efficiently track stories that spread on Twitter during these events. In this paper, we present a novel semi-automatic tool that enables users to efficiently identify and track stories about real-world events on Twitter. We ran a user study with 25 participants, demonstrating that compared to more conventional methods, our tool can increase the speed and the accuracy with which users can track stories about real-world events.
Article
Online social networks provide a rich substrate for rumor propagation. Information received via friends tends to be trusted, and online social networks allow individuals to transmit information to many friends at once. By referencing known rumors from Snopes.com, a popular website documenting memes and urban legends, we track the propagation of thousands of rumors appearing on Facebook. From this sample we infer the rates at which rumors from different categories and of varying truth value are uploaded and reshared. We find that rumor cascades run deeper in the social network than reshare cascades in general. We then examine the effect of individual reshares receiving a comment containing a link to a Snopes article on the evolution of the cascade. We find that receiving such a comment increases the likelihood that a reshare of a rumor will be deleted. Furthermore, large cascades are able to accumulate hundreds of Snopes comments while continuing to propagate. Finally, using a dataset of rumors copied and pasted from one status update to another, we show that rumors change over time and that different variants tend to dominate different bursts in popularity.
Article
Though Twitter acts as a real-time news source, with people acting as sensors and sending event updates from all over the world, rumors spread via Twitter have been noted to cause considerable damage. Given a set of popular Twitter events along with related users and tweets, we study the problem of automatically assessing the credibility of such events. We propose a credibility analysis approach enhanced with event graph-based optimization to solve the problem. First we experiment by performing PageRank-like credibility propagation on a multi-typed network consisting of events, tweets, and users. Further, within each iteration, we enhance the basic trust analysis by updating event credibility scores using regularization on a new graph of events. Our experiments using events extracted from two tweet feed datasets, each with millions of tweets, show that our event graph optimization approach outperforms the basic credibility analysis approach. Also, our methods are significantly more accurate (∼86%) than the decision tree classifier approach (∼72%).
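A minimal sketch of the basic propagation step, before the event-graph regularization, might look like the following: credibility scores are repeatedly averaged across links between events, tweets, and users, anchored to initial classifier scores. The damping scheme is an assumption in the spirit of PageRank, not the paper's exact update rule.

```python
# links: node -> neighboring nodes in the multi-typed (event/tweet/user) graph
# init:  node -> initial credibility in [0, 1] from a base classifier
def propagate(links, init, iters=20, damping=0.85):
    score = dict(init)
    for _ in range(iters):
        new = {}
        for node, nbrs in links.items():
            avg = sum(score[n] for n in nbrs) / len(nbrs) if nbrs else 0.0
            # blend the neighborhood consensus with the node's own prior
            new[node] = (1 - damping) * init[node] + damping * avg
        score = new
    return score

links = {"event1": ["tweet1", "tweet2"], "tweet1": ["event1", "userA"],
         "tweet2": ["event1", "userA"], "userA": ["tweet1", "tweet2"]}
init = {"event1": 0.5, "tweet1": 0.9, "tweet2": 0.2, "userA": 0.7}
print(propagate(links, init)["event1"])
```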
Article
Twitter and other social media are now a major method of information exchange and dissemination. Although they can support rapid communication and sharing of useful information, they can also facilitate the spread of rumors, which contain unverified information. The purpose of the work reported here was to examine several design ideas for reducing the spread of health-related rumors in a Twitter-like environment. The results have shown that exposing people to information that refutes rumors or warns that the statement has appeared on rumor websites could reduce the spread of rumors. These results suggest that social media technologies can be designed such that users can self correct and inactivate potentially inaccurate information in their environment.
Conference Paper
Social media have become an established feature of the dynamic information space that emerges during crisis events. Both emergency responders and the public use these platforms to search for, disseminate, challenge, and make sense of information during crises. In these situations rumors also proliferate, but just how fast such information can spread is an open question. We address this gap, modeling the speed of information transmission to compare retransmission times across content and context features. We specifically contrast rumor-affirming messages with rumor-correcting messages on Twitter during a notable hostage crisis to reveal differences in transmission speed. Our work has important implications for the growing field of crisis informatics.
Article
Viral products and ideas are intuitively understood to grow through a person-to-person diffusion process analogous to the spread of an infectious disease; however, until recently it has been prohibitively difficult to directly observe purportedly viral events, and thus to rigorously quantify or characterize their structural properties. Here we propose a formal measure of what we label “structural virality” that interpolates between two conceptual extremes: content that gains its popularity through a single, large broadcast and that which grows through multiple generations with any one individual directly responsible for only a fraction of the total adoption. We use this notion of structural virality to analyze a unique data set of a billion diffusion events on Twitter, including the propagation of news stories, videos, images, and petitions. We find that across all domains and all sizes of events, online diffusion is characterized by surprising structural diversity; that is, popular events regularly grow via both broadcast and viral mechanisms, as well as essentially all conceivable combinations of the two. Nevertheless, we find that structural virality is typically low, and remains so independent of size, suggesting that popularity is largely driven by the size of the largest broadcast. Finally, we attempt to replicate these findings with a model of contagion characterized by a low infection rate spreading on a scale-free network. We find that although several of our empirical findings are consistent with such a model, it fails to replicate the observed diversity of structural virality, thereby suggesting new directions for future modeling efforts.
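Since a diffusion event can be represented as a tree, structural virality, understood as the mean shortest-path distance over all pairs of nodes in the diffusion tree, is straightforward to compute; the sketch below contrasts the two conceptual extremes using networkx.

```python
import networkx as nx

# Structural virality as the average pairwise distance in a diffusion tree:
# a pure broadcast (star) scores low, a long chain of reshares scores high.
def structural_virality(tree):
    return nx.average_shortest_path_length(tree)

star = nx.star_graph(99)      # one broadcast reaching 99 adopters
chain = nx.path_graph(100)    # 99 generations of person-to-person sharing
print(structural_virality(star), structural_virality(chain))
```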
Article
We consider statistical inference for regression when data are grouped into clusters, with regression model errors independent across clusters but correlated within clusters. Examples include data on individuals with clustering on village or region or other category such as industry, and state-year differences-in-differences studies with clustering on state. In such settings, default standard errors can greatly overstate estimator precision. Instead, if the number of clusters is large, statistical inference after OLS should be based on cluster-robust standard errors. We outline the basic method as well as many complications that can arise in practice. These include cluster-specific fixed effects, few clusters, multiway clustering, and estimators other than OLS.
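In practice this is a one-line option in common regression libraries. The sketch below uses statsmodels on synthetic data with within-cluster correlation to contrast default and cluster-robust standard errors.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic data: errors share a cluster-level component, so default
# standard errors understate the true sampling variability.
rng = np.random.default_rng(42)
n_clusters, per_cluster = 50, 20
cluster = np.repeat(np.arange(n_clusters), per_cluster)
x = rng.normal(size=n_clusters * per_cluster)
cluster_effect = rng.normal(size=n_clusters)[cluster]  # within-cluster correlation
y = 1.0 + 0.5 * x + cluster_effect + rng.normal(size=len(x))
df = pd.DataFrame({"y": y, "x": x, "cluster": cluster})

ols = smf.ols("y ~ x", data=df).fit()
crse = smf.ols("y ~ x", data=df).fit(cov_type="cluster",
                                     cov_kwds={"groups": df["cluster"]})
print("default SE:", ols.bse["x"], "cluster-robust SE:", crse.bse["x"])
```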
Article
Many machine learning algorithms require the input to be represented as a fixed-length feature vector. When it comes to texts, one of the most common fixed-length features is bag-of-words. Despite their popularity, bag-of-words features have two major weaknesses: they lose the ordering of the words and they also ignore semantics of the words. For example, "powerful," "strong" and "Paris" are equally distant. In this paper, we propose Paragraph Vector, an unsupervised algorithm that learns fixed-length feature representations from variable-length pieces of texts, such as sentences, paragraphs, and documents. Our algorithm represents each document by a dense vector which is trained to predict words in the document. Its construction gives our algorithm the potential to overcome the weaknesses of bag-of-words models. Empirical results show that Paragraph Vectors outperforms bag-of-words models as well as other techniques for text representations. Finally, we achieve new state-of-the-art results on several text classification and sentiment analysis tasks.
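The Paragraph Vector idea is available off the shelf as Doc2Vec in gensim; the sketch below trains on a toy corpus and infers a fixed-length vector for unseen text. The corpus and hyperparameters are illustrative only, and real use requires far more data.

```python
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

corpus = ["the movie was powerful and strong",
          "paris is a beautiful city",
          "a strong performance by the lead actor"]
docs = [TaggedDocument(words=text.split(), tags=[i])
        for i, text in enumerate(corpus)]
model = Doc2Vec(docs, vector_size=50, min_count=1, epochs=40)

# infer a fixed-length vector for an unseen piece of text
vec = model.infer_vector("a powerful film".split())
print(vec.shape)   # (50,)
```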
Article
We describe latent Dirichlet allocation (LDA), a generative probabilistic model for collections of discrete data such as text corpora. LDA is a three-level hierarchical Bayesian model, in which each item of a collection is modeled as a finite mixture over an underlying set of topics. Each topic is, in turn, modeled as an infinite mixture over an underlying set of topic probabilities. In the context of text modeling, the topic probabilities provide an explicit representation of a document. We present efficient approximate inference techniques based on variational methods and an EM algorithm for empirical Bayes parameter estimation. We report results in document modeling, text classification, and collaborative filtering, comparing to a mixture of unigrams model and the probabilistic LSI model.
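A minimal fit of LDA on a toy corpus, using scikit-learn's implementation, looks like the following; the number of topics and the documents are illustrative. Each row of the transformed output is a document's mixture over the learned topics.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

docs = ["the election results and the vote count",
        "a new vaccine trial reported strong results",
        "voters went to the polls for the election",
        "the trial tested the vaccine on volunteers"]
counts = CountVectorizer(stop_words="english").fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(counts)

# each document is represented as a mixture over the learned topics
print(lda.transform(counts).round(2))
```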
Article
A few hubs with many connections coexist with many individuals that have only a few connections.
Article
A Mathematical Theory of Communication. Bell System Technical Journal, 27, pp. 379–423 (July) and pp. 623–656 (October).
Article
Minimization of the error probability to determine optimum signals is often difficult to carry out. Consequently, several suboptimum performance measures that are easier than the error probability to evaluate and manipulate have been studied. In this partly tutorial paper, we compare the properties of an often used measure, the divergence, with a new measure that we have called the Bhattacharyya distance. This new distance measure is often easier to evaluate than the divergence. In the problems we have worked, it gives results that are at least as good as, and often better than, those given by the divergence.
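For reference, for two discrete probability distributions p and q, the Bhattacharyya distance takes the standard form

D_B(p, q) = -\ln \sum_i \sqrt{p_i \, q_i},

where the sum, known as the Bhattacharyya coefficient, measures the overlap between the two distributions: the greater the overlap, the smaller the distance. This is the general form of the measure; the paper develops it in a signal-detection setting.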
Conference Paper
In this paper we study and evaluate rumor-like methods for combating the spread of rumors on a social network. We model rumor spread as a diffusion process on a network and suggest the use of an "anti-rumor" process similar to the rumor process. We study two natural models by which these anti-rumors may arise. The main metrics we study are the belief time, i.e., the duration for which a person believes the rumor to be true, and the point of decline, i.e., the point after which the anti-rumor process dominates the rumor process. We evaluate our methods by simulating rumor spread and anti-rumor spread on a data set derived from the social networking site Twitter and on a synthetic network generated according to the Watts and Strogatz model. We find that the lifetime of a rumor increases if the delay in detecting it increases, and the relationship is at least linear. Further, our findings show that coupling the detection and anti-rumor strategies by embedding agents in the network (we call them beacons) is an effective means of fighting the spread of rumors, even if these beacons do not share information.
Conference Paper
A rumor is commonly defined as a statement whose truth value is unverifiable. Rumors may spread misinformation (false information) or disinformation (deliberately false information) through a network of people. Identifying rumors is crucial in online social media, where large amounts of information are easily spread across a large network by sources with unverified authority. In this paper, we address the problem of rumor detection in microblogs and explore the effectiveness of three categories of features: content-based, network-based, and microblog-specific memes for correctly identifying rumors. Moreover, we show how these features are also effective in identifying disinformers, users who endorse a rumor and further help it to spread. We perform our experiments on more than 10,000 manually annotated tweets collected from Twitter and show how our retrieval model achieves more than 0.95 in Mean Average Precision (MAP). Finally, we believe that our dataset is the first large-scale dataset on rumor detection. It can open new dimensions in analyzing online misinformation and other aspects of microblog conversations.