Vincent A. Traag's research while affiliated with Leiden University and other places
What is this page?
This page lists the scientific contributions of an author, who either does not have a ResearchGate profile, or has not yet added these contributions to their profile.
It was automatically created by ResearchGate to create a record of this author's body of work. We create such pages to advance our goal of creating and maintaining the most comprehensive scientific repository possible. In doing so, we process publicly available (personal) data relating to the author as a member of the scientific community.
If you're a ResearchGate member, you can follow this page to keep up with this author's work.
If you are this author, and you don't want us to display this page anymore, please let us know.
Publications (82)
The study of biases, such as gender or racial biases, is an important topic in the social and behavioural sciences. However, the concept of bias is not always clearly defined in the literature. Definitions of bias are often ambiguous, or definitions are not provided at all. To study biases in a precise way, it is important to have a well-defined co...
Citations in science are being studied from several perspectives. On the one hand, there are approaches such as scientometrics and the science of science, which take a more quantitative perspective. In this chapter I briefly review some of the literature on citations, citation distributions and models of citations. These citations feature prominent...
Theoretical arguments and empirical investigations indicate that a high proportion of published findings are false or do not replicate. The current position paper provides a broad perspective on this scientific error, focusing both on reform history and on opportunities for future reform. Talking points are organised along four main themes: methodo...
This paper introduces a framework for understanding complex temporal interaction patterns in large-scale scientific collaboration networks. In particular, we investigate how two key concepts in science studies, scientific collaboration and scientific mobility, are related and possibly differ between fields. We do so by analyzing multilayer temporal...
Articles in high-impact journals are, on average, more frequently cited. But are they cited more often because those articles are somehow more “citable”? Or are they cited more often simply because they are published in a high-impact journal? Although some evidence suggests the latter, the causal relationship is not clear. We here compare citations...
Most scientometricians reject the use of the journal impact factor for assessing individual articles and their authors. The well-known San Francisco Declaration on Research Assessment also strongly objects to this way of using the impact factor. Arguments against the use of the impact factor at the level of individual articles are often based...
As the COVID-19 pandemic unfolds, researchers from all disciplines are coming together and contributing their expertise. CORD-19, a dataset of COVID-19 and coronavirus publications, has been made available alongside calls to help mine the information it contains and to create tools to search it more effectively. We analyse the delineation of the pu...
[This corrects the article DOI: 10.1098/rsos.190207.].
In the past decades, many countries have started to fund academic institutions based on the evaluation of their scientific performance. In this context, peer review is often used to assess scientific performance. Bibliometric indicators have been suggested as an alternative. A recurrent question in this context is whether peer review and metrics te...
Examining coauthorship networks is key to study scientific collaboration patterns and structural characteristics of scientific communities. Here, we studied coauthorship networks of sociologists in Italy, using temporal and multi-level quantitative analysis. By looking at publications indexed in Scopus, we detected research communities among Italia...
As the COVID-19 pandemic unfolds, researchers from all disciplines are coming together and contributing their expertise. CORD-19, a dataset of COVID-19 and coronavirus publications, has recently been published alongside calls to help mine the information it contains, and to create tools to search it more effectively. Here, we focus on the delineati...
Citation networks of scientific publications offer fundamental insights into the structure and development of scientific knowledge. We propose a new measure, called intermediacy, for tracing the historical development of scientific knowledge. Given two publications, an older and a more recent one, intermediacy identifies publications that seem to p...
Articles in high-impact journals are by definition more highly cited on average. But are they cited more often because the articles are somehow "better"? Or are they cited more often simply because they appeared in a high-impact journal? Although some evidence suggests the latter, the causal relationship is not clear. We here compare citations of pu...
This chapter is concerned with signed networks, where each link is associated with either a positive (+) or a negative (‐) sign. Blockmodeling, as a way of partitioning social networks, started with a clear substantive rationale expressed in terms of social roles. However, the availability of algorithms for partitioning (unsigned) networks, based on...
Minority integration is a highly contested topic in public debates, and assimilationist actors appear to have gained discursive ground. However, it remains difficult to accurately depict how power relations in debates change and evolve. In this study, the public debates on minority integration in Flanders and the Netherlands between 2006 and 2012 a...
Community detection is often used to understand the structure of large and complex networks. One of the most popular algorithms for uncovering community structure is the so-called Louvain algorithm. We show that this algorithm has a major defect that largely went unnoticed until now: the Louvain algorithm may yield arbitrarily badly connected commu...
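The abstract above is truncated, so the exact defect it describes is not reproduced here. As a minimal sketch of one related diagnostic, assuming that a community whose members cannot all reach each other through links inside that community counts as badly connected, the following checks a given partition for internally disconnected communities. The function name `disconnected_communities` and the input format are hypothetical and are not taken from the paper.

```python
from collections import deque


def disconnected_communities(edges, membership):
    """Return the labels of communities whose induced subgraph is not connected.

    edges: iterable of (u, v) pairs of an undirected graph.
    membership: dict mapping every node to its community label.
    (Hypothetical helper for illustration only.)
    """
    # Keep only links whose endpoints share a community.
    adj = {node: set() for node in membership}
    for u, v in edges:
        if membership[u] == membership[v]:
            adj[u].add(v)
            adj[v].add(u)

    # Group nodes by community label.
    members = {}
    for node, label in membership.items():
        members.setdefault(label, []).append(node)

    bad = set()
    for label, nodes in members.items():
        # Breadth-first search restricted to intra-community links.
        seen = {nodes[0]}
        queue = deque([nodes[0]])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        if len(seen) < len(nodes):
            bad.add(label)
    return bad


# Path 1-2-3-4: nodes 1 and 4 are only linked through community "B".
path = [(1, 2), (2, 3), (3, 4)]
split = disconnected_communities(path, {1: "A", 2: "B", 3: "B", 4: "A"})
intact = disconnected_communities(path, {1: "A", 2: "A", 3: "B", 4: "B"})
```

In the first partition, community "A" is flagged because its two members have no path between them that stays inside "A"; in the second, nothing is flagged.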
When performing a national research assessment, some countries rely on citation metrics whereas others, such as the UK, primarily use peer review. In the influential Metric Tide report, a low agreement between metrics and peer review in the UK Research Excellence Framework (REF) was found. However, earlier studies observed much higher agreement bet...
Signed networks appear naturally in contexts where conflict or animosity is apparent. In this book chapter we review some of the literature on signed networks, especially in the context of partitioning. Most of the work is founded in what is known as structural balance theory. We cover the basic mathematical principles of structural balance theory....
Protesters are usually young, relatively well-educated, middle-class people who are politically engaged. But where do protesters come from? We here show, based on mobile phone data, that distance is an important impediment to protest attendance. Most protesters come from nearby regions, suggesting distance forms an obstacle to participation. Althou...
This paper elaborates a relational approach to examine discursive contention. We develop a network method to identify groups forming through contentious interactions as well as relational measures of polarization, leadership, solidarity and various aspects of discursive power. The paper analyzes how an assimilationist movement confronted its advers...
Mobile phone data have been used extensively in recent years to study social behavior. However, most of these studies are based on only partial data whose coverage is limited both in space and time. In this paper, we point to an observation that the bias due to the limited coverage in time may have an important influence on the results of the a...
Results for presidential candidates, election 2000.
Daily donors (a)—raw data is transparent, smoothed data is solid—and cumulative donors (b), probability effect of (c) donor degree, (d) common community donor degree, (e) donor communities, (f) source diversity and (g) previous donation. Logistic regression results (h) general effects, (i) network...
Logistic regression results for republican candidates.
Results for 2000–2012 for donation to the republican candidate as dependent variable.
(TIFF)
Results for presidential candidates, election 2004.
Daily donors (a)—raw data is transparent, smoothed data is solid— and cumulative donors (b), probability effect of (c) donor degree, (d) common community donor degree, (e) donor communities, (f) source diversity and (g) previous donation. Logistic regression results (h) general effects, (i) networ...
Results for presidential candidates, election 2008.
Daily donors (a)—raw data is transparent, smoothed data is solid— and cumulative donors (b), probability effect of (c) donor degree, (d) common community donor degree, (e) donor communities, (f) source diversity and (g) previous donation. Logistic regression results (h) general effects, (i) networ...
Results for parties, election 2008.
Daily donors (a)—raw data is transparent, smoothed data is solid— and cumulative donors (b), probability effect of (c) donor degree, (d) common community donor degree, (e) donor communities, (f) source diversity and (g) previous donation. Logistic regression results (h) general effects, (i) network effects for ne...
Results for presidential candidates, election 2012.
Daily donors (a)—raw data is transparent, smoothed data is solid— and cumulative donors (b), probability effect of (c) donor degree, (d) common community donor degree, (e) donor communities, (f) source diversity and (g) previous donation. Logistic regression results (h) general effects, (i) networ...
Cross-cutting logistic regression results for democratic party.
Results for 2000–2012 for donation to the democratic party as dependent variable, including cross-exposure effects (i.e. effect of exposure to republican donors on democratic donations).
(TIFF)
Logistic regression results for democratic candidates.
Results for 2000–2012 for donation to the democratic candidate as dependent variable.
(TIFF)
Results for parties, election 2012.
Daily donors (a)—raw data is transparent, smoothed data is solid— and cumulative donors (b), probability effect of (c) donor degree, (d) common community donor degree, (e) donor communities, (f) source diversity and (g) previous donation. Logistic regression results (h) general effects, (i) network effects for ne...
Results for parties, election 2004.
Daily donors (a)—raw data is transparent, smoothed data is solid— and cumulative donors (b), probability effect of (c) donor degree, (d) common community donor degree, (e) donor communities, (f) source diversity and (g) previous donation. Logistic regression results (h) general effects, (i) network effects for ne...
Logistic regression results for democratic party.
Results for 2000–2012 for donation to the democratic party as dependent variable.
(TIFF)
Cross-cutting logistic regression results for republican party.
Results for 2000–2012 for donation to the republican party as dependent variable, including cross-exposure effects (i.e. effect of exposure to democratic donors on republican donations).
(TIFF)
Logistic regression results for republican party.
Results for 2000–2012 for donation to the republican party as dependent variable.
(TIFF)
Money is central in US politics, and most campaign contributions stem from a tiny, wealthy elite. Like other political acts, campaign donations are known to be socially contagious. We study how campaign donations diffuse through a network of more than 50000 elites and examine how connectivity among previous donors reinforces contagion. We find that...
Cross-cutting logistic regression results for republican candidates.
Results for 2000–2012 for donation to the republican candidate as dependent variable, including cross-exposure effects (i.e. effect of exposure to democratic donors on republican donations).
(TIFF)
Results for parties, election 2000.
Daily donors (a)—raw data is transparent, smoothed data is solid— and cumulative donors (b), probability effect of (c) donor degree, (d) common community donor degree, (e) donor communities, (f) source diversity and (g) previous donation. Logistic regression results (h) general effects, (i) network effects for ne...
Cross-cutting logistic regression results for democratic candidates.
Results for 2000–2012 for donation to the democratic candidate as dependent variable, including cross-exposure effects (i.e. effect of exposure to republican donors on democratic donations).
(TIFF)
Data for replication.
The Excel file donations.xls contains detailed donation records for the presidential campaigns for studying whether the complex contagion of donations is driven by cohesive reinforcement or independent reinforcement. It also contains the aggregate statistics for other campaigns to predict the total amount of money raised. The...
Social networks have been of much interest in recent years. We here focus on a network structure derived from co-occurrences of people in traditional newspaper media. We find three clear deviations from what can be expected in a random graph. First, the average degree in the empirical network is much lower than expected, and the average weight of a...
This paper presents a new method of identifying a nation's political elite using computational techniques on digitised newspaper articles. It begins by describing the three most widely used methods of identifying political elites: positional, decisional and reputational. It then introduces the "reported elite method", exploring the kinds of elites...
Many complex networks exhibit a modular structure of densely connected groups of nodes. Usually, such a modular structure is uncovered by the optimisation of some quality function. Although flawed, Modularity remains one of the most popular quality functions. The Louvain algorithm was originally developed for optimising Modularity, but has been app...
Nodes in real-world networks are repeatedly observed to form dense clusters, often referred to as communities. Methods to detect these groups of nodes usually maximize an objective function, which implicitly contains the definition of a community. We here analyze a recently proposed measure called Surprise, which assesses the quality of the partiti...
This paper introduces the Elite Network Shifts (ENS) project to the Asian Studies community where computational techniques are used with digitised newspaper articles to describe changes in relations among Indonesian political elites. Reflecting on how "political elites" and "political relations" are understood by the elites, as well as across the d...
We present a new computational methodology to identify national political elites, and demonstrate it for Indonesia. On the basis that elites have an "organised capacity to make real and continuing political trouble", we identify them as those individuals who occur most frequently in a large corpus of politically-oriented newspaper articles. Doing t...
Studies of human attention dynamics analyse how attention is focused on specific topics, issues or people. In online social media, there are clear signs of exogenous shocks, bursty dynamics, and an exponential or power-law lifetime distribution. We here analyse the attention dynamics of traditional media, focussing on co-occurrence of people in new...
The rise of social media allowed for rich analyses of their content and their network structure. As traditional media (i.e. newspapers and magazines) are being digitized, similar analyses can be undertaken. This provides a glimpse of the elite, as the news mostly revolves around the more influential members of society. We here focus on a network st...
Although the field of community detection is relatively young, quite a few methods and algorithms have already been introduced in the literature. In this chapter, we will review several of these methods, and provide some algorithms for implementing these methods. We will derive most of these methods from a relatively general framework, to which we r...
The field of community detection has a short but rich history, and communities have been found to be useful in many different settings. We here review two applications of community detection, and in the process we will show how previously discussed problems appear and are addressed.
The distinction between positive and negative links is not often made. Nonetheless, it can be essential for understanding the network structure. We here review an old theory from sociology, known as social balance theory. The idea is similar to the old adage of “the enemy of my enemy is my friend”. We will derive some of the classical results, whic...
Most methods for community detection assume that the weight of links is positive. However, there are many situations in which it is natural to use negative weights, for example, for modelling conflict or hatred, or correlations. We briefly address this issue in this chapter, and see that some methods are better able to cope with negative weights th...
Some multi-resolution methods may be able to overcome the issue of the resolution-limit. Nonetheless, it remains difficult to find “meaningful” or “good” resolution values. In addition, it is not always clear whether the observed partition is really different from what can be observed in a random graph. We here introduce the notion of the significa...
The evolution of cooperation is a long-standing problem that has baffled biologists and sociologists alike. The problem is that not cooperating often allows for a higher immediate benefit, so why should cooperation take place? Nonetheless, we often observe cooperative behaviour, especially so in humans. One of the theories is that people use a repu...
Although modularity has been one of the most frequently used methods over the past decade, it suffers from some drawbacks. We will review these drawbacks here, and see whether the other methods reviewed in this thesis suffer from similar drawbacks. One of the most well-known problems is that of the resolution-limit, and we will introduce a more formal a...
Social balance theory states that signed social networks should tend to split in two factions, each faction having only positive links within and negative links between the two. Although the theory has long been concerned with finding evidence of such groupings in social networks, little attention has been devoted to what dynamics may give rise to...
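The two-faction structure described above can be tested directly. The sketch below is an illustration and not code from the paper: it tries to assign every node to one of two factions so that positive links stay within a faction and negative links run between factions, and it reports failure as soon as a link contradicts an earlier assignment. The function name `is_balanced` and the `(u, v, sign)` edge format are assumptions made for this example.

```python
from collections import deque


def is_balanced(nodes, signed_edges):
    """Check whether an undirected signed graph splits into two factions.

    signed_edges: iterable of (u, v, sign) with sign +1 or -1.
    Returns (True, faction) with faction mapping node -> 0 or 1,
    or (False, None) if no consistent split exists.
    (Hypothetical helper for illustration only.)
    """
    adj = {n: [] for n in nodes}
    for u, v, sign in signed_edges:
        adj[u].append((v, sign))
        adj[v].append((u, sign))

    faction = {}
    for start in nodes:
        if start in faction:
            continue
        # Each connected component is coloured independently.
        faction[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v, sign in adj[u]:
                # Friends share a faction, enemies sit in opposite ones.
                expected = faction[u] if sign > 0 else 1 - faction[u]
                if v not in faction:
                    faction[v] = expected
                    queue.append(v)
                elif faction[v] != expected:
                    return False, None
    return True, faction


people = ["a", "b", "c"]
# Two friends with a common enemy: consistent with two factions.
ok, factions = is_balanced(people, [("a", "b", 1), ("a", "c", -1), ("b", "c", -1)])
# Three mutual enemies: no split into two factions satisfies every link.
bad, _ = is_balanced(people, [("a", "b", -1), ("a", "c", -1), ("b", "c", -1)])
```

This only answers whether a perfectly balanced split exists; it says nothing about the dynamics that might lead a network towards such a split, which is the question the abstract raises.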
In many online settings, such as in online markets or peer-to-peer applications, we prefer to deal with trustworthy partners. Usually, by letting users rate each other, some indication of trustworthiness is obtained. However, the ratings of users that are themselves not trustworthy should not be trusted. We here suggest a method for solvin...
We argue that theories regarding the relationship between trade and conflict could benefit greatly from accounting for the networked structure of international trade. Indirect trade relations reduce the probability of conflict by creating (1) opportunity costs of conflict beyond those reflected by direct trade ties; and (2) negative externalities f...
Many complex networks show signs of modular structure, uncovered by community detection. Although many methods succeed in revealing various partitions, it remains difficult to detect at what scale some partition is significant. This problem shows foremost in multi-resolution methods. We here introduce an efficient method for scanning for resolution...
Results including type A, B and defectors.
(TIFF)
Phase portrait of system S12-S13. Circular orbits in the upper half plane (a > 0) are traversed counter clockwise, whereas circular orbits in the lower half plane (a < 0) are traversed clockwise.
(TIFF)
Results different intensities of selection.
(TIFF)
Proofs and details of statements in the main paper.
(PDF)
Social life coalesces into communities through cooperation and conflict. As a case in point, Shwed and Bearman (2010) studied consensus and contention in scientific communities. They used a sophisticated modularity method to detect communities on the basis of scientific citations, which they then interpreted as directed positive network ties. They...
Mobile phone datasets allow for the analysis of human behavior on an unprecedented scale. The social network, temporal dynamics and mobile behavior of mobile phone users have often been analyzed independently from each other using mobile phone datasets. In this article, we explore the connections between various features of human behavior extracted...
Social networks with positive and negative links often split into two antagonistic factions. Examples of such a split abound: revolutionaries versus an old regime, Republicans versus Democrats, Axis versus Allies during the Second World War, or the Western versus the Eastern bloc during the Cold War. Although this structure, known as social balance...
The unprecedented amount of data from mobile phones creates new possibilities to analyze various aspects of human behavior. Over the last few years, much effort has been devoted to studying the mobility patterns of humans. In this paper we will focus on unusually large gatherings of people, i.e. unusual social events. We introduce the methodology o...
Detecting communities in large networks has drawn much attention over the years. While modularity remains one of the more popular methods of community detection, the so-called resolution limit remains a significant drawback. To overcome this issue, it was recently suggested that instead of comparing the network to a random null model, as is done in...
Explaining how cooperation can emerge, and persist over time in various species is a prime challenge for both biologists and social scientists. Whereas cooperation in non-human species might be explained through mechanisms such as kinship selection or reciprocity, this is usually regarded as insufficient to explain the extent of cooperation observe...
Networks have attracted a great deal of attention over the last decade, and play an important role in various scientific disciplines. Ranking nodes in such networks, based on for example PageRank or eigenvector centrality, remains a hot topic. Not only does this have applications in ranking web pages, it also allows peer-to-peer systems to have effectiv...
Detecting communities in complex networks accurately is a prime challenge, preceding further analyses of network characteristics and dynamics. Until now, community detection took into account only positively valued links, while many actual networks also feature negative links. We extend an existing Potts model to incorporate negative links as well,...
Citations
... In this case, we should not normalise citations for the journal J, because doing so would most likely make the normalised citations a less accurate indicator for quality Q, not a more accurate indicator for quality Q. In fact, based on this observation, the journal J might be a more accurate indicator of Q than the citations C, as suggested by Waltman and Traag (2020). Now suppose that author prestige A and departmental prestige P affect acceptance, so that A → J and P → J, for which there is some evidence, as we saw in section II A. If we assume that A and P are independent of Q, we might want to normalise for those effects, so that the normalised citations are not biased by A or P, but a more accurate reflection of Q. ...
Reference: Citation models and research evaluation
... However, this conclusion is problematic if there is a causal effect of where a paper is published on how frequently it is cited. Being published in a high-ranked journal will affect the subsequent citations (Traag, 2021), so the citations do not necessarily reflect whether peer review is predictive; they just reflect the causal effect of being published in a certain venue. A similar problem arises in a recent analysis of the predictive validity of peer review when highlighting publications in a journal (Antonoyiannakis, 2021). ...
Reference: Citation models and research evaluation
... Bibliometric studies have been widely applied in multiple scholarly areas, with few successfully guiding decision-making across the respective thematic fields (19-21). At present, a plethora of scientometric and bibliometric studies have been published aiming at gaining more insights on the landscape of publications related to COVID-19 (13, 22-26). However, only a small portion of bibliometric analyses have explored the pandemic temporally in terms of research output during the first months (27-30), and additionally, the economic aspect driving scholarly productivity has not been systematically examined. ...
... In doing so, it would prevent Italian sociology from being structured by genuine scientific controversies and theoretical oppositions, contributing to a certain sterilisation of the discipline. All the more so because, as the main quantitative studies on the subject have confirmed, based on the analysis of mutual citations and co-authorships among Italian academics (Riviera, 2015; Akbaritabar et al., 2020), the division of the national space of sociology into components partly overlaps with the division linked to thematic specialisations: the Mi-To, for example, is over-represented among sociologists of inequality, the economy, politics and social movements; the "Catholics" among sociologists of culture, migration, social intervention and communication; and the "Romans" among methodologists. ...
... If one searches for publications that mention terms related to the virus anywhere in the text, one finds 58,000 publications. Compared with earlier epidemics, this is several thousand publications more [301]. The majority of these publications were not studies with data but expressions of opinion [302], which is not surprising given the scarcely existing empirical basis in the early phase of the pandemic. ...
... Citation analysis as a tool for evaluating a scientific journal was developed by Eugene Garfield when he worked at the Institute for Scientific Information (ISI); he introduced the WoS database (belonging to ISI) and published the Journal Citation Reports (JCR) in 1976 [10]. In a report of his from more than 50 years ago, he referred to citation analysis as a valid and valuable means of creating accurate historical descriptions of scientific fields [11]. Citations are always an influence of the field of knowledge, especially for fields of great scientific impact, from which it is possible to determine the most relevant studies [12]; therefore, to know the impact of a journal, the citations of its publications must be analysed. ...
... The line (deletion) index of balance measures the minimum number of links whose removal results in balance. Since then, apart from subsequent works focusing on this index [280,281], many other approaches have been proposed, such as measures of balance in terms of simple cycles [279, 282-285], in terms of inconsistent links within the signed blockmodel framework [286,287], walk-based measures [288-290], energy-based measures [291], measures based on algebraic topology tools [292], and on solution of correlation clustering problems [293,294]. For a recent comparison of these measures, see [295,296]. ...
... Single-cell clustering is always an important task in the field of single-cell analysis, which allows us to infer the identity of cells. PhenoGraph is applied as the clustering method in the CITEMO framework, which uses the Leiden algorithm as an emerging clustering method designed specifically for single-cell data [77,78]. In particular, PhenoGraph is optimized for the clusters with broken links in Leiden clustering distribution, giving a more reasonable clustering result with more subpopulations. ...
... Studies have reported wildly varying correlations, ranging from as low as 0.3 to as high as 0.97. There are two major factors that explain the differences in these results (Traag and Waltman, 2019). The first factor is what level of aggregation is being studied. ...
Reference: Citation models and research evaluation
... Literature [10] adopted human body detection based on FAST-CNN. Literature [11] proposed an attitude partitioning network for node detection and intensive regression. ...