Claudia Wagner

GESIS - Leibniz-Institute for the Social Sciences | GESIS · Department of Computational Social Science

90
Publications
26,696
Reads
3,744
Citations

Publications (90)
Article
Full-text available
Open science practices have been widely discussed and have been implemented with varying success in different disciplines. We argue that computational-x disciplines, such as computational social science, are also susceptible to the symptoms of the crises, but in terms of reproducibility. We expand the binary definition of reproducibility into a tier...
Preprint
Surveys are a cornerstone of empirical social science research, providing invaluable insights into the opinions, beliefs, behaviours, and characteristics of people. However, issues such as refusal to participate, skipping questions, sampling bias, and attrition significantly impact the quality and reliability of survey data. Recently, researchers h...
Preprint
Pairwise comparisons based on human judgements are an effective method for determining rankings of items or individuals. However, as human biases perpetuate from pairwise comparisons to recovered rankings, they affect algorithmic decision making. In this paper, we introduce the problem of fairness-aware ranking recovery from pairwise comparisons. W...
Article
Machine learning (ML)-based content moderation tools are essential to keep online spaces free from hateful communication. Yet ML tools can only be as capable as the quality of the data they are trained on allows them. While there is increasing evidence that they underperform in detecting hateful communications directed towards specific identities a...
Preprint
Full-text available
There is an increase in the proliferation of online hate commensurate with the rise in the usage of social media. In response, there is also a significant advancement in the creation of automated tools aimed at identifying harmful text content using approaches grounded in Natural Language Processing and Deep Learning. Although it is known that trai...
Article
Full-text available
Human feedback is often used, either directly or indirectly, as input to algorithmic decision making. However, humans are biased: if the algorithm that takes as input the human feedback does not control for potential biases, this might result in biased algorithmic decision making, which can have a tangible impact on people’s lives. In this paper, w...
Article
The characterization and detection of bots with their presumed ability to manipulate society on social media platforms have been subject to many research endeavors over the last decade. In the absence of ground truth data (i.e., accounts that are labeled as bots by experts or self-declare their automated nature), researchers interested in the chara...
Article
Full-text available
Inequality prevails in science. Individual inequality means that most perish quickly and only a few are successful, while gender inequality implies that there are differences in achievements for women and men. Using large-scale bibliographic data and following a computational approach, we study the evolution of individual and gender inequality for...
Article
Full-text available
We illustrate how standard psychometric inventories originally designed for assessing noncognitive human traits can be repurposed as diagnostic tools to evaluate analogous traits in large language models (LLMs). We start from the assumption that LLMs, inadvertently yet inevitably, acquire psychological traits (metaphorically speaking) from the vast...
Article
This review paper provides a conceptualization of AI-assisted content moderation with various degrees of autonomy and summarizes experimental evidence for how different levels of automation in content moderation and related losses of autonomy affect individuals and groups. Our results show that current research predominantly focuses on individual l...
Preprint
Full-text available
The hipster paradox in Electronic Dance Music is the phenomenon that commercial success is collectively considered illegitimate while serious and aspiring professional musicians strive for it. We study this behavioral dilemma using digital traces of performing live and releasing music as they are stored in the Resident Advisor, Jun...
Article
The hipster paradox in Electronic Dance Music is the phenomenon that commercial success is collectively considered illegitimate while serious and aspiring professional musicians strive for it. We study this behavioral dilemma using digital traces of performing live and releasing music as they are stored in the Resident Advisor, Juno Download, and D...
Preprint
Full-text available
Network-based people recommendation algorithms are widely employed on the Web to suggest new connections in social media or professional platforms. While such recommendations bring people together, the feedback loop between the algorithms and the changes in network structure may exacerbate social biases. These biases include rich-get-richer effects...
Preprint
Full-text available
Counterfactually Augmented Data (CAD) aims to improve out-of-domain generalizability, an indicator of model robustness. The improvement is credited with promoting core features of the construct over spurious artifacts that happen to correlate with it. Yet, over-relying on core features may lead to unintended model bias. Especially, construct-driven...
Article
Full-text available
Though algorithms promise many benefits including efficiency, objectivity and accuracy, they may also introduce or amplify biases. Here we study two well-known algorithms, namely PageRank and Who-to-Follow (WTF), and show to what extent their ranks produce inequality and inequity when applied to directed social networks. To this end, we propose a d...
Article
Full-text available
Social networks are very important carriers of information. For instance, the political leaning of our friends can serve as a proxy to identify our own political preferences. This explanatory power is leveraged in many scenarios ranging from business decision-making to scientific research to infer missing attributes using machine learning. However,...
Preprint
Full-text available
Though algorithms promise many benefits including efficiency, objectivity and accuracy, they may also introduce or amplify biases. Here we study two well-known algorithms, namely PageRank and Who-to-Follow (WTF), and show under which circumstances their ranks produce inequality and inequity when applied to directed social networks. To this end, we...
Preprint
As NLP models are increasingly deployed in socially situated settings such as online abusive content detection, it is crucial to ensure that these models are robust. One way of improving model robustness is to generate counterfactually augmented data (CAD) for training models that can better learn to distinguish between core features and data artif...
Article
People’s activities and opinions recorded as digital traces online, especially on social media and other web-based platforms, offer increasingly informative pictures of the public. They promise to allow inferences about populations beyond the users of the platforms on which the traces are recorded, representing real potential for the social science...
Article
Online community managers work towards building and managing communities around a given brand or topic. A risk imposed on such managers is that their community may die out and its utility diminish to users. Understanding what drives attention to content and the dynamics of discussions in a given community informs the community manager and/or host w...
Article
It has been the historic responsibility of the social sciences to investigate human societies. Fulfilling this responsibility requires social theories, measurement models and social data. Most existing theories and measurement models in the social sciences were not developed with the deep societal reach of algorithms in mind. The emergence of ‘algo...
Article
Full-text available
Research has focused on automated methods to effectively detect sexism online. Although overt sexism seems easy to spot, its subtle forms and manifold expressions are not. In this paper, we outline the different dimensions of sexism by grounding them in their implementation in psychological scales. From the scales, we derive a codebook for sexism i...
Preprint
Measures of algorithmic fairness often do not account for human perceptions of fairness that can substantially vary between different sociodemographics and stakeholders. The FairCeptron framework is an approach for studying perceptions of fairness in algorithmic decision making such as in ranking or classification. It supports (i) studying human pe...
Article
Full-text available
Data sharing, research ethics, and incentives must improve
Preprint
Full-text available
To effectively tackle sexism online, research has focused on automated methods for detecting sexism. In this paper, we use items from psychological scales and adversarial sample generation to 1) provide a codebook for different types of sexism in theory-driven scales and in social media text; 2) test the performance of different sexism detection me...
Article
Full-text available
Artificial Intelligence (AI)‐based systems are widely employed nowadays to make decisions that have far‐reaching impact on individuals and society. Their decisions might affect everyone, everywhere, and anytime, entailing concerns about potential human rights issues. Therefore, it is necessary to move beyond traditional AI algorithms optimized for...
Article
Full-text available
People’s perceptions about the size of minority groups in social networks can be biased, often showing systematic over- or underestimation. These social perception biases are often attributed to biased cognitive or motivational processes. Here we show that both over- and underestimation of the size of a minority group can emerge solely from structu...
Preprint
Full-text available
The interactions and activities of hundreds of millions of people worldwide are recorded as digital traces every single day. When pulled together, these data offer increasingly comprehensive pictures of both individuals and groups interacting on different platforms, but they also allow inferences about broader target populations beyond those platfo...
Book
This volume constitutes the proceedings of the 11th International Conference on Social Informatics, SocInfo 2019, held in Doha, Qatar, in November 2019. The 17 full and 5 short papers presented in these proceedings were carefully reviewed and selected from 86 submissions. The papers presented in this volume cover a broad range of topics, ranging fr...
Preprint
Full-text available
Do only major scientific breakthroughs hit the news and social media, or does a 'catchy' title help to attract public attention? How strong is the connection between the importance of a scientific paper and the (social) media attention it receives? In this study we investigate these questions by analysing the relationship between the observed atten...
Preprint
Full-text available
Do only major scientific breakthroughs hit the news and social media, or does a 'catchy' title help to attract public attention? How strong is the connection between the importance of a scientific paper and the (social) media attention it receives? In this study we investigate these questions by analysing the relationship between the observed atten...
Article
Full-text available
Homophily can put minority groups at a disadvantage by restricting their ability to establish links with a majority group or to access novel information. Here, we show how this phenomenon can influence the ranking of minorities in examples of real-world networks with various levels of heterophily and homophily ranging from sexual contacts, dating c...
Article
Emergent patterns of collective attention towards scientists and their research may function as a proxy for scientific impact which traditionally is assessed via committees that award prizes to scientists. Therefore it is crucial to understand the relationships between scientific impact and online demand and supply for information about scientists...
Article
Full-text available
Relational inference leverages relationships between entities and links in a network to infer information about the network from a small sample. This method is often used when global information about the network is not available or difficult to obtain. However, how reliable is inference from a small labelled sample? How should the network be sampl...
Article
Full-text available
Individuals' perceptions about the prevalence of attributes in their social networks are commonly skewed by the limited information available to them. Filter bubbles -- being exposed to other like-minded people -- and majority illusion -- overestimation of minorities in social networks -- are two examples of how perception biases can manifest. In th...
Article
Full-text available
Previous research has shown the existence of gender biases in the depiction of professions and occupations in search engine results. Such an unbalanced presentation might just as likely occur on Wikipedia, one of the most popular knowledge resources on the Web, since the encyclopedia has already been found to exhibit such tendencies in past studies...
Article
Scientific collaborations shape novel ideas and new discoveries and help scientists to advance their scientific career through publishing high impact publications and grant proposals. Recent studies however show that gender inequality is still present in many scientific practices ranging from hiring to peer review processes and grant applications....
Conference Paper
Online freelancing marketplaces have grown quickly in recent years. In theory, these sites offer workers the ability to earn money without the obligations and potential social biases associated with traditional employment frameworks. In this paper, we study whether two prominent online freelance marketplaces - TaskRabbit and Fiverr - are impacted b...
Article
Sampling from large networks represents a fundamental challenge for social network research. In this paper, we explore the sensitivity of different sampling techniques (node sampling, edge sampling, random walk sampling, and snowball sampling) on social networks with attributes. We consider the special case of networks (i) where we have one attribu...
Article
Full-text available
Homophily can put minority groups at a disadvantage by restricting their ability to establish links with people from a majority group. This can limit the overall visibility of minorities in the network. Building on a Barabási-Albert model variation with groups and homophily, we show how the visibility of minority groups in social networks is a...
Article
Full-text available
Wikipedia articles about the same topic in different language editions are built around different sources of information. For example, one can find very different news articles linked as references in the English Wikipedia article titled "Annexation of Crimea by the Russian Federation" than in its German counterpart (determined via Wikipedia's lang...
Conference Paper
Full-text available
This tutorial aims at outlining fundamental methods for studying typical social science research questions with organic data (i.e., data that has not been designed for a specific research purpose but can be found on the Web). Further, social theories, statistical methods and models that help to understand the processes that generated the data will...
Article
Computational social scientists often harness the Web as a "societal observatory" where data about human social behavior is collected. This data enables novel investigations of psychological, anthropological and sociological research questions. However, in the absence of demographic information, such as gender, many relevant research questions cann...
Article
Full-text available
Contributing to the writing of history has never been as easy as it is today thanks to Wikipedia, a community-created encyclopedia that aims to document the world's knowledge from a neutral point of view. Though everyone can participate, it is well known that the editor community has a narrow diversity, with a majority of white male editors. While t...
Conference Paper
For many people, Wikipedia represents one of the primary sources of knowledge about foreign cultures. Yet, different Wikipedia language editions offer different descriptions of cultural practices. Unveiling diverging representations of cultures provides an important insight, since they may foster the formation of cross-cultural stereotypes, misunde...
Conference Paper
Culinary preferences contribute significantly to our sense of self [2]. While gender, race, sexuality and ethnicity describe our "major identity", preferences in music, style and food define our "minor identity". However, we find that only certain parts of them can be explained by gender-specific differences in the food consumption behavior, whi...
Article
Wikipedia is a community-created encyclopedia that contains information about notable people from different countries, epochs and disciplines and aims to document the world's knowledge from a neutral point of view. However, the narrow diversity of the Wikipedia editor community has the potential to introduce systemic biases such as gender biases in...
Article
Food is a central element of human life, and food preferences are, among other things, manifestations of social, cultural and economic forces that influence the way we view, prepare and consume food. Historically, data for studies of food preferences stems from consumer panels which continuously capture food consumption and preference patterns from ind...
Article
Full-text available
For many people, Wikipedia represents one of the primary sources of knowledge about foreign cultures. Yet, different Wikipedia language editions offer different descriptions of cultural phenomena. Unveiling diverging representations of cultures is an important problem since they may foster the formation of cross-cultural stereotypes, misunderstandi...
Article
Full-text available
Assessing political conversations in social media requires a deeper understanding of the underlying practices and styles that drive these conversations. In this paper, we present a computational approach for assessing online conversational practices of political parties. Following a deductive approach, we devise a number of quantitative measures fr...
Conference Paper
Since food is one of the central elements of all human beings, a high interest exists in exploring temporal and spatial food and dietary patterns of humans. Predominantly, data for such investigations stem from consumer panels which continuously capture food consumption patterns from individuals and households. In this work we leverage data from a...
Article
Full-text available
One potential disadvantage of social tagging systems is that due to the lack of a centralized vocabulary, a crowd of users may never manage to reach a consensus on the description of resources (e.g., books, images, users, or songs) on the Web. Yet, previous research has provided interesting evidence that the tag distributions of resources in social...
Article
Online social networks (OSN) like Twitter or Facebook are popular and powerful since they allow reaching millions of users online. They are also a popular target for socialbot attacks. Without a deep understanding of the impact of such attacks, the potential of online social networks as an instrument for facilitating discourse or democratic process...
Article
In the past, online social networks (OSN) like Facebook and Twitter became powerful instruments for communication and networking. Unfortunately, they have also become a welcome target for socialbot attacks. Therefore, a deep understanding of the nature of such attacks is important to protect the ecosystem of OSNs. In this extended abstract we prop...
Article
Full-text available
One potential disadvantage of social tagging systems is that due to the lack of a centralized vocabulary, a crowd of users may never manage to reach a consensus on the description of resources (e.g., books, users or songs) on the Web. Yet, previous research has provided interesting evidence that the tag distributions of resources may become semanti...
Conference Paper
Finding the "right people" is a central aspect of social media systems. Twitter has millions of users who have varied interests, professions and personalities. For those in fields such as advertising and marketing, it is important to identify certain characteristics of users to target. However, Twitter users do not generally provide sufficient info...
Conference Paper
Full-text available
Interpreting the meaning of a document represents a fundamental challenge for current semantic analysis methods. One interesting aspect mostly neglected by existing methods is that authors of a document usually assume certain background knowledge of their intended audience. Based on this knowledge, authors usually decide what to communicate and how...
Conference Paper
Content injection methods rely on understanding community dynamics (i.e. attention factors) in order to publish content that community users will engage with (e.g. product-related posts), however such methods require re-training should the community's discussed topics change. In this paper we present an examination of the semantic evolution of comm...
Conference Paper
For community managers and hosts it is not only important to identify the current key topics of a community but also to assess the specificity level of the community for: a) creating sub-communities, and: b) anticipating community behaviour and topical evolution. In this paper we present an approach that empirically characterises the topical specif...
Conference Paper
Full-text available
This paper sets out to explore whether data about the usage of hashtags on Twitter contains information about their semantics. Towards that end, we perform initial statistical hypothesis tests to quantify the association between usage patterns and semantics of hashtags. To assess the utility of pragmatic features -- which describe how a hashtag is u...
Conference Paper
Anticipating repliers in online conversations is a fundamental challenge for computer mediated communication systems which aim to make textual, audio and/or video communication as natural as face to face communication. The massive amounts of data that social media generates has facilitated the study of online conversations on a scale unimaginable a...
Article
Full-text available
One of the key challenges for users of social media is judging the topical expertise of other users in order to select trustful information sources about specific topics and to judge credibility of content produced by others. In this paper, we explore the usefulness of different types of user-related data for making sense about the topical expertis...
Conference Paper
Full-text available
Online community managers work towards building and managing communities around a given brand or topic. A risk imposed on such managers is that their community may die out and its utility diminish to users. Understanding what drives attention to content and the dynamics of discussions in a given community informs the community manager and/or host wi...
Article
Social bots are automatic or semi-automatic computer programs that mimic humans and/or human behavior in online social networks. Social bots can attack users (targets) in online social networks to pursue a variety of latent goals, such as to spread information or to influence targets. Without a deep understanding of the nature of such attacks or...
Conference Paper
Online community managers work towards building and managing communities around a given brand or topic. A risk imposed on such managers is that their community may die out and its utility diminish to users. Understanding what drives attention to content and the dynamics of discussions in a given community informs the community manager and/or host w...
Chapter
Full-text available
Judging the topical expertise of micro-bloggers is one of the key challenges for information seekers when deciding which information sources to follow. However, it is unclear how useful different types of information are for people to make expertise judgments and to what extent their background knowledge influences their judgments. This study explored d...
Conference Paper
Full-text available
Social media has become an integral part of today's web and allows communities to share content and socialize. Understanding the factors that influence how communities evolve over time - for example how their social network and their content co-evolve - is an issue of both theoretical and practical relevance. This paper sets out to study the tempor...
Conference Paper
Full-text available
Social media has become an integral part of today's web and allows users to share content and socialize. Understanding the factors that influence how users evolve over time - for example how their social network and their contents co-evolve - is an issue of both theoretical and practical relevance. This paper sets out to study the temporal co-evolu...
Conference Paper
Full-text available
This paper presents an adaptable system for detecting trends based on the micro-blogging service Twitter, and sets out to explore to what extent such a tool can support researchers. Twitter has high uptake in the scientific community, but there is a need for a means of extracting the most important topics from a Twitter stream. There are too many t...