
Frank Takes
Leiden University | LEI · Leiden Institute of Advanced Computer Science
Visit Google Scholar for complete information
84
Publications
15,678
Reads
1,161
Citations
Publications (84)
Higher-order networks effectively represent complex systems with group interactions. Existing methods usually overlook the relative contribution of group interactions (hyperlinks) of different sizes to the overall network structure. Yet, this has many important applications, especially when the network has meaningful node labels. In this work, we p...
Cliques, groups of fully connected nodes in a network, are often used to study group dynamics of complex systems. In real-world settings, group dynamics often have a temporal component. For example, conference attendees moving from one group conversation to another. Recently, maximal clique enumeration methods have been introduced that add temporal...
This paper considers the European transfer market for professional football players as a network to study the relation between a team’s position in this network and performance in its domestic league. Our analysis is centered on eight top European leagues. The market in each season is represented as a weighted directed network capturing the transfe...
In this paper we introduce a general version of the anonymization problem in social networks, in which the goal is to maximize the number of anonymous nodes by altering a given graph. We define three variants of this optimization problem, being full, partial and budgeted anonymization. In each, the objective is to maximize the number of k-anonymous...
In this work we focus on identifying key players in dark net cryptomarkets that facilitate online trade of illegal goods. Law enforcement aims to disrupt criminal activity conducted through these markets by targeting key players vital to the market’s existence and success. We particularly focus on detecting successful vendors responsible for the ma...
Our perceptions are shaped by the social networks we are embedded in. Despite the acknowledged influence of close contacts on how we perceive the world, the role of the broader social environment remains opaque. Here, we leverage a unique combination of population-scale social network and survey data on perceptions of immigration. We find that both...
Privacy-aware sharing of network data is a difficult task due to the interconnectedness of individuals in networks. An important part of this problem is the inherently difficult question of how in a particular situation the privacy of an individual node should be measured. To that end, in this paper we propose a set of aspects that one should consi...
The dominance of online social media data as a source of population-scale social network studies has recently been challenged by networks constructed from government-curated register data. In this paper, we investigate how the two compare, focusing on aggregations of the Dutch online social network (OSN) Hyves and a register-based social network (R...
This paper proposes a novel framework for empirically assessing the effect of network characteristics on the performance of pretrained link prediction models. In link prediction, the task is to predict missing or future links in a given network dataset. We focus on the pretrained setting, in which such a predictive model is trained on one dataset,...
Ensuring privacy of individuals is of paramount importance to social network analysis research. Previous work assessed anonymity in a network based on the non-uniqueness of a node’s ego network. In this work, we show that this approach does not adequately account for the strong de-anonymizing effect of distant connections. We first propose the use...
When dealing with sensitive data in automated data-driven decision-making, an important concern is to learn predictors with high performance towards a class label, whilst minimising for the discrimination towards any sensitive attribute, like gender or race, induced from biased data. Hybrid tree optimisation criteria have been proposed which combin...
In this work we focus on identifying key players in dark net cryptomarkets. Law enforcement aims to disrupt criminal activity conducted through these markets by targeting key players vital to the market's existence and success. We particularly focus on detecting successful vendors responsible for the majority of illegal trade. Our methodology aims...
Ensuring privacy of individuals is of paramount importance to social network analysis research. Previous work assessed anonymity in a network based on the non-uniqueness of a node's ego network. In this work, we show that this approach does not adequately account for the strong de-anonymizing effect of distant connections. We first propose the use...
This paper proposes methods for efficiently computing the anonymity of entities in networks. We do so by partitioning nodes into equivalence classes where a node is k-anonymous if it is equivalent to k − 1 other nodes. This assessment of anonymity is crucial when one wants to share data and must ensure the anonymity of entities represented is comp...
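The equivalence-class idea in the abstract above can be illustrated with a minimal sketch. It assumes, purely for illustration, that two nodes are equivalent when they have the same degree; the paper's own equivalence measures are not given in this excerpt. The function name `anonymity_by_degree` and the toy graph are hypothetical.

```python
from collections import Counter


def anonymity_by_degree(adjacency):
    """Return {node: k}, where k is the size of the node's equivalence class.

    Assumption (not from the source): nodes are equivalent when they have
    the same degree. A node with value k is equivalent to k - 1 other nodes.
    """
    # Signature of each node under the assumed equivalence: its degree.
    signature = {node: len(neighbours) for node, neighbours in adjacency.items()}
    # Size of each equivalence class, keyed by signature.
    class_sizes = Counter(signature.values())
    return {node: class_sizes[sig] for node, sig in signature.items()}


# Hypothetical toy graph: a path a-b-c-d plus a pendant node e attached to b.
toy_graph = {
    "a": {"b"},
    "b": {"a", "c", "e"},
    "c": {"b", "d"},
    "d": {"c"},
    "e": {"b"},
}
k_values = anonymity_by_degree(toy_graph)
```

In this toy graph the three degree-1 nodes share one class, while b and c are each unique under the assumed measure and would therefore be the least anonymous nodes.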
Large-scale human social network structure is typically inferred from digital trace samples of online social media platforms or mobile communication data. Instead, here we investigate the social network structure of a complete population, where people are connected by high-quality links sourced from administrative registers of family, household, wo...
We propose a social network-aware approach to studying socio-economic segregation. The key question that we address is whether patterns of segregation are more pronounced in social networks than the common spatial neighborhood-focused manifestations of segregation. We, therefore, conduct a population-scale social network analysis to study socio-eco...
Circulation is the characteristic feature of successful currency systems, from community currencies to cryptocurrencies to national currencies. In this paper, we propose a network analysis approach especially suited for studying circulation given a system’s digital transaction records. Sarafu is a digital community currency that was active in Kenya...
Cargo ships navigating global waters are required to be sufficiently safe and compliant with international treaties. Governmental inspectorates currently assess in a rule-based manner whether a ship is potentially noncompliant and thus needs inspection. One of the dominant ship characteristics in this assessment is the ‘colour’ of the flag a ship i...
Large-scale human social network structure is typically inferred from digital trace samples of online social media platforms or mobile communication data. Instead, here we investigate the social network structure of a complete population, where people are connected by high-quality links sourced from administrative registers of family, household, wo...
The velocity of money is an important driver of inflation that is conventionally measured as an average for an economy as a whole. While easy to calculate from macroeconomic aggregates, such measures overlook possibly relevant heterogeneity between payment systems, across regions, and intrinsic to spending patterns. This paper proposes a new measur...
Circulation is the characteristic feature of successful currency systems, from community currencies to cryptocurrencies to national currencies. In this paper, we propose a network analysis methodology for studying circulation given a system's digital transaction records. This is applied to Sarafu, a digital community currency active in Kenya over a...
Community detection is a well-established method for studying the meso-scale structure of social networks. Applying a community detection algorithm results in a division of a network into communities that is often used to inspect and reason about community membership of specific nodes. This micro-level interpretation step of community structure is...
Corporations seek various relationships, such as board interlocks, with other firms to reduce resource dependencies. The consistent theoretical expectation and empirical finding that physical proximity is an important driver for board interlock formation is seemingly at odds with the emerging and growing literature on transnational board interlock...
This paper investigates the stability and evolution of the world stage of global science at the city level by analyzing changes in co-authorship network centrality rankings over time. Driven by the problem that there exists no consensus in the literature on how the spatial unit “city” should be defined, we first propose a new approach to delineate...
The transnationalization of economic activities has fundamentally altered the world. One of the consequences that has intrigued scholars is the formation of a transnational corporate elite. While the literature tends to focus on the topology of the transnational board interlock network, little is known about its driving mechanisms. This article ask...
Link prediction is a well-studied technique for inferring the missing edges between two nodes in some static representation of a network. In modern day social networks, the timestamps associated with each link can be used to predict future links between so-far unconnected nodes. In these so-called temporal networks, we speak of temporal link predic...
When dealing with sensitive data in automated data-driven decision-making, an important concern is to learn predictors with high performance towards a class label, whilst minimising for the discrimination towards any sensitive attribute, like gender or race, induced from biased data. A few hybrid tree optimisation criteria exist that combine classi...
This paper introduces a framework for understanding complex temporal interaction patterns in large-scale scientific collaboration networks. In particular, we investigate how two key concepts in science studies, scientific collaboration and scientific mobility, are related and possibly differ between fields. We do so by analyzing multilayer temporal...
Production networks are integral to economic dynamics, yet dis-aggregated network data on inter-firm trade is rarely collected and often proprietary. Here we situate company-level production networks within a wider space of networks that are different in nature, but similar in local connectivity structure. Through this lens, we study a regional and...
What do football passes and financial transactions have in common? Both are networked walk processes that we can observe, where records take the form of timestamped events that move something tangible from one node to another. Here we propose an approach to analyze this type of data that extracts the actual trajectories taken by the tangible items...
Production networks are integral to economic dynamics, yet dis-aggregated network data on inter-firm trade is rarely collected and often proprietary. Here we situate company-level production networks among networks from other domains according to their local connectivity structure. Through this lens, we study a regional and a national network of in...
Transit of wasteful materials within the European Union is highly regulated through a system of permits. Waste processing costs vary greatly depending on the waste category of a permit. Therefore, companies may have a financial incentive to allege transporting waste with erroneous categorisation. Our goal is to assist inspectors in selecting potent...
In link prediction, the goal is to predict which links will appear in the future of an evolving network. To estimate the performance of these models in a supervised machine learning model, disjoint and independent train and test sets are needed. However, objects in a real-world network are inherently related to each other. Therefore, it is far from...
This book contains a selection of the best papers of the 32nd Benelux Conference on Artificial Intelligence, BNAIC/Benelearn 2020, held in Leiden, The Netherlands, in November 2020. Due to the COVID-19 pandemic the conference was held online.
The 12 papers presented in this volume were carefully reviewed and selected from 41 regular submissions....
Community detection is a well established method for studying the meso scale structure of social networks. Applying a community detection algorithm results in a division of a network into communities that is often used to inspect and reason about community membership of specific nodes. This micro level interpretation step of community structure is...
The long tradition of scholarly work on corporate interlocks has left us with competing theoretical frameworks on the causes of interlock networks. Board interlocks are studied either as means to overcome the resource dependence of corporations or as a group cohesion mechanism of business elites. This contrast is due to an empirical divide of the l...
In this study we use machine learning to perform explainable business sector prediction from financial statements. Financial statements are a valuable source of information on the financial state and performance of firms. Recently, large-scale data on financial statements has become available in the form of open data sets. Previous work on such dat...
The goal of this paper is to learn the dynamics of truck co-driving behaviour. Understanding this behaviour is important because co-driving has a potential positive impact on the environment. In the so-called co-driving network, trucks are nodes while links indicate that two trucks frequently drive together. To understand the network’s dynamics, we...
This paper proposes novel algorithms for efficiently counting complex network motifs in dynamic networks that are changing over time. Network motifs are small characteristic configurations of a few nodes and edges, and have repeatedly been shown to provide insightful information for understanding the meso-level structure of a network. Here...
This paper deals with the trick-taking game of Klaverjas, in which two teams of two players aim to gather as many high valued cards for their team as possible. We propose an efficient encoding to enumerate possible configurations of the game, such that subsequently αβ-search can be employed to effectively determine whether a given hand of cards is wi...
The overwhelming amount of network data that is nowadays available, leads to an increased demand for techniques that automatically identify anomalous nodes. Examples are network intruders in physical networks or spammers spreading unwanted advertisements in online social networks. Existing methods typically identify network anomalies from a local p...
This paper proposes a novel approach to count temporal motifs in multilayer complex networks. Network motifs, i.e., small characteristic patterns of a handful of nodes and edges, have repeatedly been shown to be instrumental in understanding real-world complex systems. However, exhaustively enumerating these motifs is computationally infeasible for...
This paper studies online child exploitation networks in which users communicate about illegal child pornography material. Law enforcement agencies are extremely interested in better understanding these networks and their key players. We utilize unique real-world network data sets collected from two different online discussion forums on the dark ne...
This paper studies the complex network structure of software design networks. In a software design network, each node is a class (a specific part of a piece of software) and each link represents a software code-related dependency between two classes. This work provides two main contributions. First, we reveal how typical software networks exhibit a...
This paper examines the co-driving behavior of truck drivers using network analysis. From a unique spatiotemporal dataset encompassing more than 10 million measurements of trucks passing 17 different highway locations in the Netherlands, we extract a so-called co-driving network. In this network, nodes are truck drivers and edges represent pairs of...
The main topic of this paper is the discovery of motifs in multiplex corporate networks. Network motifs are small subgraphs occurring at significantly higher numbers than in similar random networks. They can be seen as the building blocks of a complex network. In real-world network data, multiple types of (possibly overlapping) relationships may be...
In corporate networks, firms are connected through links of corporate ownership and shared directors, connecting the control over major economic actors in our economies in meaningful and consequential ways. Most research thus far focused on the connectedness of firms as a result of one particular link type, analyzing node-specific metrics or global...
Cryptocurrencies such as Bitcoin and Ethereum have recently gained a lot of popularity, not only as a digital form of currency but also as an investment vehicle. Online marketplaces and exchanges allow users across the world to convert between dozens of different cryptocurrencies and regular currencies such as euros or dollars. Due to the novelty o...
Network data on connections between corporate actors and entities – for instance through co-ownership ties or elite social networks – are increasingly available to researchers interested in probing the many important questions related to the study of modern capitalism. Given the analytical challenges associated with the nature of the subject matter...
Multinational corporations use highly complex structures of parents and subsidiaries to organize their operations and ownership. Offshore Financial Centers (OFCs) facilitate these structures through low taxation and lenient regulation, but are increasingly under scrutiny, for instance for enabling tax avoidance. Therefore, the identification of OFC...
Multinational corporations use highly complex structures of parents and subsidiaries to organize their operations and ownership. Offshore Financial Centers (OFCs) facilitate these structures through low taxation and lenient regulation, but are increasingly under scrutiny, for instance for enabling tax avoidance. Therefore, the identification of OFC...
Nowadays, social networks of ever increasing size are studied by researchers from a range of disciplines. The data underlying these networks is often automatically gathered from APIs, websites or existing databases. As a result, the quality of this data is typically not manually validated, and the resulting networks may be based on false, biased o...
Nowadays, social networks of ever increasing size are studied by researchers from a range of disciplines. The data underlying these networks is often automatically gathered from APIs, websites or existing databases. As a result, the quality of this data is typically not manually validated, and the resulting networks may be based on false, biased o...
Corporations across the world are highly interconnected in a large global network of corporate control. This paper investigates the global board interlock network, covering 400,000 firms linked through 1,700,000 edges representing shared directors between these firms. The main focus is on the concept of centrality, which is used to investigate the...
Data that involves some sort of relationship or interaction can be represented, modelled and analyzed using the notion of a network. To understand the dynamics of networks, the link prediction problem is concerned with predicting the evolution of the topology of a network over time. Previous work in this direction has largely focussed on finding an...
Corporations across the world are highly interconnected in a large global network of corporate control. This paper investigates the global board interlock network, covering 400,000 firms linked through 1,700,000 edges representing shared directors between these firms. The main focus is on the concept of centrality, which is used to investigate the...
We argue against the dominant a priori distinction between national and transnational in the study of corporate elites. Business elites reconfigure their locus of organization over time, from the city level, to the national level, and beyond. We ask what the current level of elite organization is and propose a novel theoretical and empirical approa...
A key debate on the merits and consequences of globalisation asks to what extent we have moved to a multipolar global political economy. Here we investigate this issue through the properties and topologies of corporate elite networks and ask: what is the community structure of the global corporate elite? In order to answer this question, we analyse...
We discuss very efficient diameter computation algorithms and apply them to huge graphs.
In this paper, we propose a new algorithm that computes the radius and the diameter of a weakly connected digraph, by finding bounds through heuristics and improving them until they are validated. Although the worst-case running time is , we will experimentally show that it performs much better in the case of real-world networks, finding the radiu...
In this paper, we will propose a new algorithm that computes the radius and the diameter of a graph G = (V,E), by finding bounds through heuristics and improving them until exact values can be guaranteed. Although the worst-case running time is \(\mathcal{O}(|V|\cdot |E|)\), we will experimentally show that, in the case of real-world networks, it p...
In this paper we propose a highly parallel GPU-based bounding algorithm for computing the exact diameter of large real-world sparse graphs. The diameter is defined as the length of the longest shortest path between vertices in the graph, and serves as a relevant property of all types of graphs that are nowadays frequently studied. Examples include...
This paper studies patterns occurring in user-generated click paths within the online encyclopedia Wikipedia. The click path data originates from over seven million goal-oriented clicks gathered from the Wiki Game, an online game in which the goal is to find a path between two given random Wikipedia articles. First we propose to use node-based path...
The eccentricity of a node in a graph is defined as the length of a longest shortest path starting at that node. The eccentricity distribution over all nodes is a relevant descriptive property of the graph, and its extreme values allow the derivation of measures such as the radius, diameter, center and periphery of the graph. This paper describes t...
The eccentricity of a node in a graph is defined as the length of a longest shortest path starting at that node. The eccentricity distribution over all nodes is a relevant descriptive property of the graph, and its extreme values allow the derivation of measures such as the radius, diameter, center and periphery of the graph. This paper describes t...
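The definitions in the abstract above (eccentricity, radius, diameter, center, periphery) can be made concrete with a small sketch. This is a naive baseline that runs one breadth-first search per node on an unweighted, connected, undirected graph; it is not the method the paper describes, which is cut off in this excerpt. The function names and the example graph are hypothetical.

```python
from collections import deque


def eccentricity(adjacency, source):
    """Length of a longest shortest path starting at `source` (unweighted BFS)."""
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    # Eccentricity is only finite when every node is reachable.
    if len(distance) != len(adjacency):
        raise ValueError("graph must be connected")
    return max(distance.values())


def eccentricity_summary(adjacency):
    """Naive baseline: one BFS per node, then derive the extreme values."""
    ecc = {node: eccentricity(adjacency, node) for node in adjacency}
    radius, diameter = min(ecc.values()), max(ecc.values())
    center = sorted(n for n, e in ecc.items() if e == radius)
    periphery = sorted(n for n, e in ecc.items() if e == diameter)
    return ecc, radius, diameter, center, periphery


# Hypothetical example: the path graph 0 - 1 - 2 - 3 - 4.
path_graph = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
ecc, radius, diameter, center, periphery = eccentricity_summary(path_graph)
```

On the path graph the middle node forms the center and the two endpoints form the periphery. Running a full BFS from every node costs time proportional to the number of nodes times the number of edges, which is the cost that more refined approaches aim to avoid on large graphs.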
In this paper we describe the structural characteristics of prominent actors that reside within an online social network. We will show how structural properties can be used in a classification algorithm based on biased random walks for distinguishing between prominent and regular nodes in a social network. The effectiveness of our approach is demon...
This ongoing research addresses the use of page ranking for computing relatedness coefficients between pairs of nodes in a directed graph, based on their edge structure. A novel, hybrid algorithm is proposed for a complete assessment of nodes and their connecting edges, which is then applied to a practical application, namely a recommender system...
In this paper we discuss the task of discovering topical influence within the online social network TWITTER. The main goal of this research is to discover who the influential users are with respect to a certain given topic. For this research we have sampled a portion of the TWITTER social graph, from which we have distilled topics and topical activ...
This paper introduces a set of classification techniques for determining the difficulty - for a human - of path traversal in an information network. In order to ensure the generalizability of our approach, we do not use ontologies or concepts of expected semantic relatedness, but rather focus on local and global structural graph properties and meas...
In this paper we present a novel approach to determine the exact diameter (longest shortest path length) of large graphs, in particular of the nowadays frequently studied small world networks. Typical examples include social networks, gene networks, web graphs and internet topology networks. Due to complexity issues, the diameter is often calculate...