Tina Eliassi-Rad
  • Northeastern University

About

168
Publications
27,981
Reads
8,027
Citations
Current institution
Northeastern University

Publications (168)
Article
Hypergraphs, which belong to the family of higher-order networks, are a natural and powerful choice for modeling group interactions in the real world. For example, when modeling collaboration networks, which may involve not just two but three or more people, the use of hypergraphs allows us to explore beyond pairwise (dyadic) patterns and capture g...
Preprint
Full-text available
Analysis of single-cell RNA sequencing data is often conducted through network projections such as coexpression networks, primarily due to the abundant availability of network analysis tools for downstream tasks. However, this approach has several limitations: loss of higher-order information, inefficient data representation caused by converting a...
Preprint
Full-text available
Classifying genome sequences based on metadata has been an active area of research in comparative genomics for decades with many important applications across the life sciences. Established methods for classifying genomes can be broadly grouped into sequence alignment-based and alignment-free models. Conventional alignment-based models rely on geno...
Preprint
Full-text available
Machine learning models for graphs in real-world applications are prone to two primary types of uncertainty: (1) those that arise from incomplete and noisy data and (2) those that arise from uncertainty of the model in its output. These sources of uncertainty are not mutually exclusive. Additionally, models are susceptible to targeted adversarial a...
Preprint
Full-text available
The limited amount of data available renders it challenging to characterize which biological processes are relevant to a rare disease. Hence, there is a need to leverage the knowledge of disease pathogenesis and treatment from the wider disease landscape to understand rare disease mechanisms. Furthermore, it is well understood that rare disease dis...
Preprint
Full-text available
Node embedding algorithms produce low-dimensional latent representations of nodes in a graph. These embeddings are often used for downstream tasks, such as node classification and link prediction. In this paper, we investigate the following two questions: (Q1) Can we explain each embedding dimension with human-understandable graph features (e.g. de...
Preprint
Full-text available
Human-AI coevolution, defined as a process in which humans and AI algorithms continuously influence each other, increasingly characterises our society, but is understudied in artificial intelligence and complexity science literature. Recommender systems and assistants play a prominent role in human-AI coevolution, as they permeate many facets of da...
Article
Full-text available
Scalable addressing of high-dimensional constrained combinatorial optimization problems is a challenge that arises in several science and engineering disciplines. Recent work introduced novel applications of graph neural networks for solving quadratic-cost combinatorial optimization problems. However, effective utilization of models such as graph n...
Preprint
Full-text available
Unraveling the human interactome to understand biological processes and uncover disease-specific patterns hinges on accurate protein-protein interaction (PPI) predictions. However, challenges persist in machine learning (ML) models due to a scarcity of quality hard negative samples, shortcut learning, and limited generalizability to novel proteins....
Article
Full-text available
Vertex classification using graph convolutional networks is susceptible to targeted poisoning attacks, in which both graph structure and node attributes can be changed in an attempt to misclassify a target node. This vulnerability decreases users' confidence in the learning method and can prevent adoption in high-stakes contexts. Defenses have been...
Article
Full-text available
The COVID-19 pandemic offers an unprecedented natural experiment providing insights into the emergence of collective behavioral changes of both exogenous (government mandated) and endogenous (spontaneous reaction to infection risks) origin. Here, we characterize collective physical distancing—mobility reductions, minimization of contacts, shortenin...
Article
Full-text available
Here we represent human lives in a way that shares structural similarity to language, and we exploit this similarity to adapt natural language processing techniques to examine the evolution and predictability of human lives based on detailed event sequences. We do this by drawing on a comprehensive registry dataset, which is available for Denmark a...
Preprint
Full-text available
Scalable addressing of high dimensional constrained combinatorial optimization problems is a challenge that arises in several science and engineering disciplines. Recent work introduced novel application of graph neural networks for solving polynomial-cost unconstrained combinatorial optimization problems. This paper proposes a new framework, calle...
Article
Identifying shortest paths between nodes in a network is a common graph analysis problem that is important for many applications involving routing of resources. An adversary that can manipulate the graph structure could alter traffic patterns to gain some benefit (e.g., make more money by directing traffic to a toll road). This paper presents the F...
Article
Full-text available
In this work, we explore multiplex graph (networks with different types of edges) generation with deep generative models. We discuss some of the challenges associated with multiplex graph generation that make it a more difficult problem than traditional graph generation. We propose TenGAN, the first neural network for multiplex graph generation, wh...
Article
Full-text available
Self-propagating malware (SPM) is responsible for large financial losses and major data breaches with devastating social impacts that cannot be overstated. Well-known campaigns such as WannaCry and Colonial Pipeline have been able to propagate rapidly on the Internet and cause widespread service disruptions. To date, the propagation behavior of SP...
Preprint
Full-text available
Vertex classification -- the problem of identifying the class labels of nodes in a graph -- has applicability in a wide variety of domains. Examples include classifying subject areas of papers in citation networks or roles of machines in a computer network. Vertex classification using graph convolutional networks is susceptible to targeted poisonin...
Preprint
Full-text available
When dealing with large graphs, community detection is a useful data triage tool that can identify subsets of the network that a data analyst should investigate. In an adversarial scenario, the graph may be manipulated to avoid scrutiny of certain nodes by the analyst. Robustness to such behavior is an important consideration for data analysts in h...
Preprint
Full-text available
Link prediction is a crucial task in graph machine learning with diverse applications. We explore the interplay between node attributes and graph topology and demonstrate that incorporating pre-trained node attributes improves the generalization power of link prediction models. Our proposed method, UPNA (Unsupervised Pre-training of Node Attributes...
Preprint
Full-text available
The rise of large-scale socio-technical systems in which humans interact with artificial intelligence (AI) systems (including assistants and recommenders, in short AIs) multiplies the opportunity for the emergence of collective phenomena and tipping points, with unexpected, possibly unintended, consequences. For example, navigation systems' suggest...
Preprint
Full-text available
Identifying shortest paths between nodes in a network is an important task in applications involving routing of resources. Recent work has shown that a malicious actor can manipulate a graph to make traffic between two nodes of interest follow their target path. In this paper, we develop a defense against such attacks by modifying the weights of th...
Article
Full-text available
The brain is a complex system comprising a myriad of interacting elements, posing significant challenges in understanding its structure, function, and dynamics. Network science has emerged as a powerful tool for studying such intricate systems, offering a framework for integrating multiscale data and complexity. Here, we discuss the application of...
Preprint
Full-text available
The brain is a complex system comprising a myriad of interacting elements, posing significant challenges in understanding its structure, function, and dynamics. Network science has emerged as a powerful tool for studying such intricate systems, offering a framework for integrating multiscale data and complexity. Here, we discuss the application of...
Article
Full-text available
The criminal legal system in the USA drives an incarceration rate that is the highest on the planet, with disparities by class and race among its signature features [1–3]. During the first year of the coronavirus disease 2019 (COVID-19) pandemic, the number of incarcerated people in the USA decreased by at least 17%, the largest, fastest reduction in p...
Article
Full-text available
Identifying novel drug-target interactions is a critical and rate-limiting step in drug discovery. While deep learning models have been proposed to accelerate the identification process, here we show that state-of-the-art models fail to generalize to novel (i.e., never-before-seen) structures. We unveil the mechanisms responsible for this shortcomi...
Preprint
Full-text available
Identifying shortest paths between nodes in a network is a common graph analysis problem that is important for many applications involving routing of resources. An adversary that can manipulate the graph structure could alter traffic patterns to gain some benefit (e.g., make more money by directing traffic to a toll road). This paper presents the F...
Preprint
Full-text available
The issue of bias (i.e., systematic unfairness) in machine learning models has recently attracted the attention of both researchers and practitioners. For the graph mining community in particular, an important goal toward algorithmic fairness is to detect and mitigate bias incorporated into graph embeddings since they are commonly used in human-cen...
Chapter
Self-propagating malware (SPM) has led to huge financial losses, major data breaches, and widespread service disruptions in recent years. In this paper, we explore the problem of developing cyber resilient systems capable of mitigating the spread of SPM attacks. We begin with an in-depth study of a well-known self-propagating malware, WannaCry, and...
Article
The structure of complex networks can be characterized by counting and analysing network motifs. Motifs are small graph structures that occur repeatedly in a network, such as triangles or chains. Recent work has generalized motifs to temporal and dynamic network data. However, existing techniques do not generalize to sequential or trajectory data,...
Preprint
Self-propagating malware (SPM) has recently resulted in large financial losses and high social impact, with well-known campaigns such as WannaCry and Colonial Pipeline being able to propagate rapidly on the Internet and cause service disruptions. To date, the propagation behavior of SPM is still not well understood, resulting in the difficulty of d...
Article
Full-text available
It is well known that networks generated by common mechanisms such as preferential attachment and homophily can disadvantage the minority group by limiting their ability to establish links with the majority group. This has the effect of limiting minority nodes’ access to information. We present the results of an empirical study on the equality of i...
Preprint
Self-propagating malware (SPM) has led to huge financial losses, major data breaches, and widespread service disruptions in recent years. In this paper, we explore the problem of developing cyber resilient systems capable of mitigating the spread of SPM attacks. We begin with an in-depth study of a well-known self-propagating malware, WannaCry, and...
Preprint
The cyber-threat landscape has evolved tremendously in recent years, with new threat variants emerging daily, and large-scale coordinated campaigns becoming more prevalent. In this study, we propose CELEST (CollaborativE LEarning for Scalable Threat detection), a federated machine learning framework for global threat detection over HTTP, which is o...
Preprint
Full-text available
Are the embeddings of a graph's degenerate core stable? What happens to the embeddings of nodes in the degenerate core as we systematically remove periphery nodes (by repeatedly peeling off k-cores)? We discover three patterns w.r.t. instability in degenerate-core embeddings across a variety of popular graph embedding algorithms and datasets. We us...
Article
Full-text available
The maritime shipping network is the backbone of global trade. Data about the movement of cargo through this network comes in various forms, from ship-level Automatic Identification System (AIS) data, to aggregated bilateral trade volume statistics. Multiple network representations of the shipping system can be derived from any one data source, eac...
Preprint
Full-text available
Identifying novel drug-target interactions (DTI) is a critical and rate limiting step in drug discovery. While deep learning models have been proposed to accelerate the identification process, we show that state-of-the-art models fail to generalize to novel (i.e., never-before-seen) structures. We first unveil the mechanisms responsible for this sh...
Preprint
Full-text available
During the first year of the COVID-19 pandemic, the number of incarcerated people in the United States decreased by at least 16%, the largest, fastest reduction in prison population in American history. Using an original dataset curated from public sources on prison demographics across all 50 states and the District of Columbia, we show that incar...
Preprint
Full-text available
The structure of complex networks can be characterized by counting and analyzing network motifs. Motifs are small subgraphs that occur repeatedly in a network, such as triangles or chains. Recent work has generalized motifs to temporal and dynamic network data. However, existing techniques do not generalize to sequential or trajectory data, which r...
Preprint
The maritime shipping network is the backbone of global trade. Data about the movement of cargo through this network comes in various forms, from ship-level Automatic Identification System (AIS) data, to aggregated bilateral trade volume statistics. Multiple network representations of the shipping system can be derived from any one data source, eac...
Chapter
Full-text available
Shortest paths in complex networks play key roles in many applications. Examples include routing packets in a computer network, routing traffic on a transportation network, and inferring semantic distances between concepts on the World Wide Web. An adversary with the capability to perturb the graph might make the shortest path between two nodes rou...
Preprint
Full-text available
Finding shortest paths in a given network (e.g., a computer network or a road network) is a well-studied task with many applications. We consider this task under the presence of an adversary, who can manipulate the network by perturbing its edge weights to gain an advantage over others. Specifically, we introduce the Force Path Problem as follows....
Preprint
Full-text available
It is well known that networks generated by common mechanisms such as preferential attachment and homophily can disadvantage the minority group by limiting their ability to establish links with the majority group. This has the effect of limiting minority nodes' access to information. In this paper, we present the results of an empirical study on th...
Article
It has been the historic responsibility of the social sciences to investigate human societies. Fulfilling this responsibility requires social theories, measurement models and social data. Most existing theories and measurement models in the social sciences were not developed with the deep societal reach of algorithms in mind. The emergence of ‘algo...
Preprint
Full-text available
Shortest paths in complex networks play key roles in many applications. Examples include routing packets in a computer network, routing traffic on a transportation network, and inferring semantic distances between concepts on the World Wide Web. An adversary with the capability to perturb the graph might make the shortest path between two nodes rou...
Article
Full-text available
Complex networks are often either too large for full exploration, partially accessible, or partially observed. Downstream learning tasks on these incomplete networks can produce low quality results. In addition, reducing the incompleteness of the network can be costly and nontrivial. As a result, network discovery algorithms optimized for specific...
Preprint
Full-text available
We present RAWLSNET, a system for altering Bayesian Network (BN) models to satisfy the Rawlsian principle of fair equality of opportunity (FEO). RAWLSNET's BN models generate aspirational data distributions: data generated to reflect an ideally fair, FEO-satisfying society. FEO states that everyone with the same talent and willingness to use it sho...
Article
Full-text available
Studies of networked phenomena, such as interactions in online social media, often rely on incomplete data, either because these phenomena are partially observed, or because the data is too large or expensive to acquire all at once. Analysis of incomplete data leads to skewed or misleading results. In this paper, we investigate limitations of learn...
Preprint
Full-text available
The problem of diffusion control on networks has been extensively studied, with applications ranging from marketing to cybersecurity. However, in many applications, such as targeted vulnerability assessment or clinical therapies, one aspires to affect a targeted subset of a network, while limiting the impact on the rest. We present a novel model in...
Preprint
Complex systems thinking is applied to a wide variety of domains, from neuroscience to computer science and economics. The wide variety of implementations has resulted in two key challenges: the progenation of many domain-specific strategies that are seldom revisited or questioned, and the siloing of ideas within a domain due to inconsistency of co...
Article
Graph embedding seeks to build a low-dimensional representation of a graph G. This low-dimensional representation is then used for various downstream tasks. One popular approach is Laplacian Eigenmaps (LE), which constructs a graph embedding based on the spectral properties of the Laplacian matrix of G. The intuition behind it, and many other e...
Preprint
Vertex classification is vulnerable to perturbations of both graph topology and vertex attributes, as shown in recent research. As in other machine learning domains, concerns about robustness to adversarial manipulation can prevent potential users from adopting proposed methods when the consequence of action is very high. This paper considers two t...
Preprint
The non-backtracking matrix and its eigenvalues have many applications in network science and graph mining, such as node and edge centrality, community detection, length spectrum theory, graph distance, and epidemic and percolation thresholds. Moreover, in network epidemiology, the reciprocal of the largest eigenvalue of the non-backtracking matrix...
Preprint
Studies of networked phenomena, such as interactions in online social media, often rely on incomplete data, either because these phenomena are partially observed, or because the data is too large or expensive to acquire all at once. Analysis of incomplete data leads to skewed or misleading results. In this paper, we investigate limitations of learn...
Chapter
Complex networks are often either too large for full exploration, partially accessible, or partially observed. Downstream learning tasks on these incomplete networks can produce low quality results. In addition, reducing the incompleteness of the network can be costly and nontrivial. As a result, network discovery algorithms optimized for specific...
Preprint
Full-text available
Complex networks are often either too large for full exploration, partially accessible or partially observed. Downstream learning tasks on incomplete networks can produce low quality results. In addition, reducing the incompleteness of the network can be costly and nontrivial. As a result, network discovery algorithms optimized for specific downstr...
Conference Paper
Most network analysis is conducted on existing incomplete samples of much larger complete, fully observed graphs. For example, many researchers obtain graphs from online data repositories without knowing how these graphs were collected. Thus, these graphs can be poor representations of the fully observed networks. More complete data would lead to m...
Conference Paper
Two basic tasks in graph analysis are: (1) computing the distance between two graphs and (2) embedding of the graph elements (i.e., nodes or links) into a lower-dimensional space. The former task has numerous applications from k-nearest neighbor search, to clustering a collection of graphs, to transfer learning. Unfortunately, there exists no canon...
Article
Full-text available
Graph distance and graph embedding are two fundamental tasks in graph mining. For graph distance, determining the structural dissimilarity between networks is an ill-defined problem, as there is no canonical way to compare two networks. Indeed, many of the existing approaches for network comparison differ in their heuristics, efficiency, interpreta...
Conference Paper
As data science and artificial intelligence become ubiquitous, they have an increasing impact on society. While many of these impacts are beneficial, others may not be. So understanding and managing these impacts is required of every responsible data scientist. Nevertheless, most human decision-makers use algorithms for efficiency purposes and not...
Article
The unsupervised detection of anomalies in time series data has important applications, e.g., in user behavioural modelling, fraud detection, and cybersecurity. Anomaly detection has been extensively studied in categorical sequences; however, we often have access to time series data that contain paths through networks. Examples include transaction s...
Preprint
Graph embedding seeks to build a low-dimensional representation of a graph G. This low-dimensional representation is then used for various downstream tasks. One popular approach is Laplacian Eigenmaps, which constructs a graph embedding based on the spectral properties of the Laplacian matrix of G. The intuition behind it, and many other embedding...
Conference Paper
Network alignment, which aims to find the node correspondence across multiple networks, is a fundamental task in many areas, ranging from social network analysis to adversarial activity detection. The state-of-the-art in the data mining community often views the node correspondence as a probabilistic cross-network node similarity, and thus inevitabl...
Conference Paper
Full-text available
The concept of k-cores is important for understanding the global structure of networks, as well as for identifying central or important nodes within a network. It is often valuable to understand the resilience of the k-cores of a network to attacks and dropped edges (i.e., damaged communications links). We provide a formal definition of a network's...
Article
Full-text available
What do the k-core structures of real-world graphs look like? What are the common patterns and the anomalies? How can we exploit them for applications? A k-core is the maximal subgraph in which all vertices have degree at least k. This concept has been applied to such diverse areas as hierarchical structure analysis, graph visualization, and graph c...
Article
Role discovery in graphs is an emerging area that allows analysis of complex graphs in an intuitive way. In contrast to other graph problems such as community discovery, which finds groups of highly connected nodes, the role discovery problem finds groups of nodes that share similar graph topological structure. However, existing work so far has t...
Article
We study the impact of network information for social security fraud detection. In a social security system, companies have to pay taxes to the government. This study aims to identify those companies that intentionally go bankrupt to avoid contributing their taxes. We link companies to each other through their shared resources, because some resourc...
Conference Paper
We study the problem of determining the proper aggregation granularity for a stream of time-stamped edges. Such streams are used to build time-evolving networks, which are subsequently used to study topics such as network growth. Currently, aggregation lengths are chosen arbitrarily, based on intuition or convenience. We describe ADAGE, which detec...
Conference Paper
Fraud is a social process that occurs over time. We introduce a new approach, called AFRAID, which utilizes active inference to better detect fraud in time-varying social networks. That is, it classifies nodes as fraudulent vs. non-fraudulent. In active inference on social networks, a set of unlabeled nodes is given to an oracle (in our case one or more...
Chapter
Given the topology of a graph G and a budget k, how can we quickly find the best k edges to delete that minimize dissemination in G? Stopping dissemination in a graph is important in a variety of fields from epidemiology to cyber security. The spread of an entity (e.g., a virus) on an arbitrary graph depends on two properties: (1) the topology of t...
Conference Paper
The fairly recent explosion in the availability of reasonably fast wireless and mobile data networks has spurred demand for more capable mobile computing devices. Conversely, the emergence of new devices increases demand for better networks, creating a virtuous cycle. The current concept of a smartphone as an always-connected computing device with...
Article
In the last decade, the ease of online payment has opened up many new opportunities for e-commerce, lowering the geographical boundaries for retail. While e-commerce is still gaining popularity, it is also the playground of fraudsters who try to misuse the transparency of online purchases and the transfer of credit card records. This paper proposes...
Conference Paper
Given a collection of m continuous-valued, one-dimensional empirical probability distributions {P1, ..., Pm}, how can we cluster these distributions efficiently with a nonparametric approach? Such problems arise in many real-world settings where keeping the moments of the distribution is not appropriate, because either some of the moments are not d...
Article
Given a labeled graph containing fraudulent and legitimate nodes, which nodes group together? How can we use the riskiness of node groups to infer a future label for new members of a group? This paper focuses on social security fraud where companies are linked to the resources they use and share. The primary goal in social security fraud is to dete...
Article
We address the problem of search on graphs with multiple nodal attributes. We call such graphs weighted attribute graphs (WAGs). Nodes of a WAG exhibit multiple attributes with varying, non-negative weights. WAGs are ubiquitous in real-world applications. For example, in a co-authorship WAG, each author is a node; each attribute corresponds to a pa...
Article
Given a large graph, like a computer communication network, which k nodes should we immunize (or monitor, or remove), to make it as robust as possible against a computer virus attack? This problem, referred to as the node immunization problem, is the core building block in many high-impact applications, ranging from public health, cybersecurity to...
Conference Paper
To conduct a successful targeting campaign in mobile advertising, one needs to have reliable location information from real-time bid requests. However, many real-time bid requests do not include fine-grained location information (such as latitude and longitude) because (1) the device or the application did not collect that information or (2) some c...
Conference Paper
Given a set of k networks, possibly with different sizes and no overlaps in nodes or links, how can we quickly assess similarity between them? Analogously, is there a set of social theories which, when represented by a small number of descriptive, numerical features, effectively serve as a "signature" for the network? Having such signatures will e...
Conference Paper
Mobile connected devices, and smartphones in particular, are rapidly emerging as a dominant computing and sensing platform. This poses several unique opportunities for data collection and analysis, as well as new challenges. In this tutorial, we survey the state-of-the-art in terms of mining data from mobile devices across different application are...
Conference Paper
Role discovery in graphs is an emerging area that allows analysis of complex graphs in an intuitive way. In contrast to community discovery, which finds groups of highly connected nodes, role discovery finds groups of nodes that share similar topological structure in the graph, and hence a common role (or function) such as being a broker or a perip...
Conference Paper
Controlling the dissemination of an entity (e.g., meme, virus, etc) on a large graph is an interesting problem in many disciplines. Examples include epidemiology, computer security, marketing, etc. So far, previous studies have mostly focused on removing or inoculating nodes to achieve the desired outcome. We shift the problem to the level of edges...