Laks Lakshmanan

University of British Columbia | UBC · Department of Computer Science

PhD

About

376
Publications
51,923
Reads
17,629
Citations
Additional affiliations
January 2001 - December 2011
Columbia University
January 2001 - December 2011
University of British Columbia

Publications (376)
Preprint
Full-text available
Large language models (LLMs) have demonstrated remarkable capabilities in comprehending and generating natural language content, attracting widespread popularity in both industry and academia in recent years. An increasing number of services have sprung up which offer LLMs for various tasks via APIs. Different LLMs demonstrate expertise in differen...
Preprint
We study the problem of robust influence maximization in dynamic diffusion networks. In line with recent works, we consider the scenario where the network can undergo insertion and removal of nodes and edges, in discrete time steps, and the influence weights are determined by the features of the corresponding nodes and a global hyperparameter. Give...
Preprint
Full-text available
Influence maximization (IM) is a classic problem that aims to identify a small group of critical individuals, known as seeds, who can influence the largest number of users in a social network through word-of-mouth. This problem finds important applications including viral marketing, infection detection, and misinformation containment. The conventio...
Preprint
Full-text available
As a fundamental topic in graph mining, Densest Subgraph Discovery (DSD) has found a wide spectrum of real applications. Several DSD algorithms, including exact and approximation algorithms, have been proposed in the literature. However, these algorithms have not been systematically and comprehensively compared under the same experimental settings....
Article
Given a graph G, a motif (e.g., 3-node clique) is a fundamental building block for G. Recently, motif-based graph analysis has attracted much attention due to its efficacy in tasks such as clustering, ranking, and link prediction. These tasks require Network Motif Discovery (NMD) at the early stage to identify the motifs of G. However, existing NM...
Conference Paper
Self-training (ST) has come to fruition in language understanding tasks by producing pseudo labels, which reduces the labeling bottleneck of language model fine-tuning. Nevertheless, in facilitating semi-supervised controllable language generation, ST faces two key challenges. First, augmented by self-generated pseudo text, generation models tend t...
Article
Full-text available
Given a directed graph G, the directed densest subgraph (DDS) problem refers to finding a subgraph from G, whose density is the highest among all subgraphs of G. The DDS problem is fundamental to a wide range of applications, such as fake follower detection and community mining. Theoretically, the DDS problem closely connects to other essential gra...
Article
While social networks greatly facilitate information dissemination, they are well known to have contributed to the phenomena of filter bubbles and echo chambers. This in turn can lead to societal polarization and erosion of trust in public institutions. Mitigating filter bubbles is an urgent open problem. Recently, approaches based on the influence...
Preprint
Self-training (ST) has come to fruition in language understanding tasks by producing pseudo labels, which reduces the labeling bottleneck of language model fine-tuning. Nevertheless, in facilitating semi-supervised controllable language generation, ST faces two key challenges. First, augmented by self-generated pseudo text, generation models tend t...
Preprint
Full-text available
Weight-sharing supernet has become a vital component for performance estimation in the state-of-the-art (SOTA) neural architecture search (NAS) frameworks. Although supernet can directly generate different subnetworks without retraining, there is no guarantee for the quality of these subnetworks because of weight sharing. In NLP tasks such as machi...
Preprint
With the prevalence of graphs for modeling complex relationships among objects, the topic of graph mining has attracted a great deal of attention from both academic and industrial communities in recent years. As one of the most fundamental problems in graph mining, the densest subgraph discovery (DSD) problem has found a wide spectrum of real appli...
Article
The question of answering queries over ML predictions has been gaining attention in the database community. This question is challenging because finding high-quality answers by invoking an oracle such as a human expert or an expensive deep neural network model on every single item in the DB and then applying the query can be prohibitive. We develo...
Article
You just got promoted to Associate Professor. Like most things in life, whether joys or sorrows, the joy of this accomplishment will not last forever. However, that doesn't mean that you should not look back and reflect on years of hard work and tenacity that you have put in which have earned you this promotion, so first of all, congratulations! Ta...
Article
Full-text available
In applications such as biological, social, and transportation networks, interactions between objects span multiple aspects. For accurately modeling such applications, multilayer networks have been proposed. Community search allows for personalized community discovery and has a wide range of applications in large real-world networks. While communit...
Preprint
Self-training (ST) has prospered again in language understanding by augmenting the fine-tuning of pre-trained language models when labeled data is insufficient. However, it remains challenging to incorporate ST into attribute-controllable language generation. Augmented by only self-generated pseudo text, generation models over-emphasize exploitatio...
Preprint
Full-text available
The prevalence of abusive language on different online platforms has been a major concern that raises the need for automated cross-platform abusive language detection. However, prior works focus on concatenating data from multiple platforms, inherently adopting Empirical Risk Minimization (ERM) method. In this work, we address this challenge from t...
Preprint
Contrastive learning (CL) has brought significant progress to various NLP tasks. Despite this progress, CL has not been applied to Arabic NLP to date. Nor is it clear how much benefit it could bring to particular classes of tasks such as those involved in Arabic social meaning (e.g., sentiment analysis, dialect identification, hate speech detection)....
Preprint
Full-text available
Neural architecture search (NAS) has demonstrated promising results on identifying efficient Transformer architectures which outperform manually designed ones for natural language tasks like neural machine translation (NMT). Existing NAS methods operate on a space of dense architectures, where all of the sub-architecture weights are activated for e...
Preprint
Full-text available
Autocomplete is a task where the user inputs a piece of text, termed a prompt, on which the model is conditioned to generate a semantically coherent continuation. Existing works for this task have primarily focused on datasets (e.g., email, chat) with high frequency user prompt patterns (or focused prompts) where word-based language models have been qu...
Preprint
We investigate the novel problem of voting-based opinion maximization in a social network: Find a given number of seed nodes for a target campaigner, in the presence of other competing campaigns, so as to maximize a voting-based score for the target campaigner at a given time horizon. The bulk of the influence maximization literature assumes that s...
Preprint
A key graph mining primitive is extracting dense structures from graphs, and this has led to interesting notions such as k-cores which subsequently have been employed as building blocks for capturing the structure of complex networks and for designing efficient approximation algorithms for challenging problems such as finding the densest subgraph...
Article
Finding the densest subgraph (DS) from a graph is a fundamental problem in graph databases. The DS obtained, which reveals closely related entities, has been found to be useful in various application domains such as e-commerce, social science, and biology. However, in a big graph that contains billions of edges, it is desirable to find more than on...
Preprint
Full-text available
We propose an information propagation model that captures important temporal aspects that have been well observed in the dynamics of fake news diffusion, in contrast with the diffusion of truth. The model accounts for differential propagation rates of truth and misinformation and for user reaction times. We study a time-sensitive variant of the ...
Preprint
The question of answering queries over ML predictions has been gaining attention in the database community. This question is challenging because the cost of finding high quality answers corresponds to invoking an oracle such as a human expert or an expensive deep neural network model on every single item in the DB and then applying the query. We de...
Article
We propose an information propagation model that captures important temporal aspects that have been well observed in the dynamics of fake news diffusion, in contrast with the diffusion of truth. The model accounts for differential propagation rates of truth and misinformation and for user reaction times. We study a time-sensitive variant of the mis...
Preprint
Full-text available
In applications such as biological, social, and transportation networks, interactions between objects span multiple aspects. For accurately modeling such applications, multilayer networks have been proposed. Community search allows for personalized community discovery and has a wide range of applications in large real-world networks. While communit...
Conference Paper
Full-text available
A key graph mining primitive is extracting dense structures from graphs, and this has led to interesting notions such as k-cores which subsequently have been employed as building blocks for capturing the structure of complex networks and for designing efficient approximation algorithms for challenging problems such as finding the densest subgraph...
Preprint
Full-text available
In this work, we focus on the problem of distinguishing a human written news article from a news article that is created by manipulating entities in a human written news article (e.g., replacing entities with factually incorrect entities). Such manipulated articles can mislead the reader by posing as a human written news article. We propose a neura...
Article
Full-text available
Influence maximization (IM) under a continuous-time diffusion model requires finding a set of initial adopters which when activated lead to the maximum expected number of users becoming activated within a given amount of time. State-of-the-art approximation algorithms applicable to solving this intractable problem use reverse reachability influence...
Article
Given a directed graph G, the directed densest subgraph (DDS) problem refers to the finding of a subgraph from G, whose density is the highest among all the subgraphs of G. The DDS problem is fundamental to a wide range of applications, such as fraud detection, community mining, and graph compression. However, existing DDS solutions suffer from...
Preprint
We consider the revenue maximization problem in social advertising, where a social network platform owner needs to select seed users for a group of advertisers, each with a payment budget, such that the total expected revenue that the owner gains from the advertisers by propagating their ads in the network is maximized. Previous studies on this pro...
Article
Given a directed graph G, the directed densest subgraph (DDS) problem refers to the finding of a subgraph from G, whose density is the highest among all the subgraphs of G. The DDS problem is fundamental to a wide range of applications, such as fraud detection, community mining, and graph compression. However, existing DDS solutions suffer from eff...
Preprint
We describe models focused on the understudied problem of translating between monolingual and code-mixed language pairs. More specifically, we offer a wide range of models that convert monolingual English text into Hinglish (code-mixed Hindi and English). Given the recent success of pretrained language models, we also test the utility of two recent...
Preprint
Full-text available
Influence maximization (IM) has garnered a lot of attention in the literature owing to applications such as viral marketing and infection containment. It aims to select a small number of seed users to adopt an item such that adoption propagates to a large number of users in the network. Competitive IM focuses on the propagation of competing items i...
Article
Influence maximization (IM) has garnered a lot of attention in the literature owing to applications such as viral marketing and infection containment. It aims to select a small number of seed users to adopt an item such that adoption propagates to a large number of users in the network. Competitive IM focuses on the propagation of competing items i...
Preprint
Full-text available
Text generative models (TGMs) excel in producing text that matches the style of human language reasonably well. Such TGMs can be misused by adversaries, e.g., by automatically generating fake news and fake product reviews that can look authentic and fool humans. Detectors that can distinguish text generated by TGM from human written text play a vit...
Article
Full-text available
The abundant availability of health-care data calls for effective analysis methods to help medical experts gain a better understanding of their patients and their health. The focus of existing work has been largely on prediction. In this paper, we introduce Core, a framework for cohort “representation” and “exploration.” Our contributions are twofo...
Article
In graph applications (e.g., biological and social networks), various analytics tasks (e.g., clustering and community search) are carried out to extract insight from large and complex graphs. Central to these tasks is the counting of the number of motifs, which are graphs with a few nodes. Recently, researchers have developed several fast motif co...
Article
Fake news is a major threat to global democracy resulting in diminished trust in government, journalism and civil society. The public popularity of social media and social networks has caused a contagion of fake news where conspiracy theories, disinformation and extreme views flourish. Detection and mitigation of fake news is one of the fundamental...
Preprint
As a dual problem of influence maximization, the seed minimization problem asks for the minimum number of seed nodes to influence a required number η of users in a given social network G. Existing algorithms for seed minimization mostly consider the non-adaptive setting, where all seed nodes are selected in one batch without observing how th...
Article
Full-text available
We propose an approach for fitting linear regression models that splits the set of covariates into groups. Essentially, the regression coefficients of the variables in each group are estimated separately from the other groups. The estimated coefficients are then pooled together to form the final fit. The optimal split of the variables into groups a...
Article
Densest subgraph discovery (DSD) is a fundamental problem in graph mining. It has been studied for decades, and is widely used in various areas, including network science, biological analysis, and graph databases. Given a graph G, DSD aims to find a subgraph D of G with the highest density (e.g., the number of edges over the number of vertices in D...
Conference Paper
As a dual problem of influence maximization, the seed minimization problem asks for the minimum number of seed nodes to influence a required number η of users in a given social network G. Existing algorithms for seed minimization mostly consider the non-adaptive setting, where all seed nodes are selected in one batch without observing how they may...
Conference Paper
Full-text available
Motivated by applications such as viral marketing, the problem of influence maximization (IM) has been extensively studied in the literature. The goal is to select a small number of users to adopt an item such that it results in a large cascade of adoptions by others. Existing works have three key limitations. (1) They do not account for economic c...
Preprint
Densest subgraph discovery (DSD) is a fundamental problem in graph mining. It has been studied for decades, and is widely used in various areas, including network science, biological analysis, and graph databases. Given a graph G, DSD aims to find a subgraph D of G with the highest density (e.g., the number of edges over the number of vertices in D...
Chapter
This chapter discusses the problem of community search in simple graphs, and focuses on just the structural characteristics of networks. In this simplest setting, a graph represents a structure of interactions within a group of vertices. We consider an undirected, unweighted simple graph G = (V(G), E(G)) with n = |V(G)| vertices and m = |E(G)| edge...
Chapter
In many real applications where information is modeled using graphs, communities are formed by a set of similar entities that are densely connected with certain relationships. Massive networks can often be understood and analyzed in terms of these communities [165]. In the literature, several different models for cohesive (dense) subgraphs and comm...
Chapter
In Chapter 3, we give an overview of community search in simple graphs, which only capture the structural characteristics of networks. Community search on a simple graph aims to find densely connected communities containing all query nodes. In applications such as analysis of protein-protein interaction (PPI) networks, citation graphs, and collabo...
Chapter
The prosperity of smartphones and other smart devices and the popularity of social networking have led to the rapid growth of geo-social networks, which are also known as location-based social networks (LBSNs). A geo-social network may contain communities that are spatially proximate. Most structural community search algorithms discussed in the pre...
Chapter
In this chapter, we discuss a special kind of communities arising in social networks, called social circles. For a user, the subgraph of the entire network induced only by his or her friends is called an ego-network. Online social networks allow users to manually categorize their friends into different social circles within their ego-networks (e.g.,...
Chapter
Research on community search is motivated by real-world applications. The availability of datasets from such applications, or real-world datasets that closely resemble them, is important to validate the models about how communities are formed in practice under a variety of conditions.
Chapter
This chapter first lists the community search models that are not detailed in the previous chapters. We then conclude the book by discussing future directions and open problems for further research in community search over large graphs.
Preprint
Full-text available
In economics, it is well accepted that adoption of items is governed by the utility that a user derives from their adoption. In this paper, we propose a model called EPIC that combines utility-driven item adoption with the viral network effect helping to propagate adoption of and desire for items from users to their peers. We focus on the case of m...
Conference Paper
We study the problem of Query Reverse Engineering (QRE), where given a database and an output table, the task is to find a simple project-join SQL query that generates that table when applied on the database. This problem is known for its efficiency challenge due to mainly two reasons. First, the problem has a very large search space and its variou...
Conference Paper
Starting with the earliest studies showing that the spread of new trends, information, and innovations is closely related to the social influence exerted on people by their social networks, the research on social influence theory took off, providing remarkable evidence on social influence induced viral phenomena. Fueled by the extreme popularity of...
Preprint
We propose an approach for fitting linear regression models that splits the set of covariates into groups. The optimal split of the variables into groups and the regularized estimation of the regression coefficients are performed by minimizing an objective function that encourages sparsity within each group and diversity among them. The estimated c...
Article
Full-text available
In a recent SIGMOD paper titled "Debunking the Myths of Influence Maximization: An In-Depth Benchmarking Study", Arora et al. [1] undertake a performance benchmarking study of several well-known algorithms for influence maximization. In the process, they contradict several published results, and claim to have unearthed and debunked several "myths"...
Article
Full-text available
Influence maximization is a combinatorial optimization problem that finds important applications in viral marketing, feed recommendation, etc. Recent research has led to a number of scalable approximation algorithms for influence maximization, such as TIM⁺ and IMM, and more recently, SSA and D-SSA. The goal of this paper is to conduct a rigorous th...
Article
Recently, community search over graphs has gained significant interest. In applications such as analysis of protein-protein interaction (PPI) networks, citation graphs, and collaboration networks, nodes tend to have attributes. Unfortunately, most previous community search algorithms ignore attributes and result in communities with poor cohesion w....
Conference Paper
Online rated datasets have become a source for large-scale population studies for analysts and a means for end-users to achieve routine tasks such as finding a book club. Existing systems however only provide limited insights into the opinions of different segments of the rater population. In this paper, we develop a framework for finding and explo...
Article
Full-text available
The gang of bandits (GOB) model (Cesa-Bianchi et al., 2013) is a recent contextual bandits framework that shares information between a set of bandit problems, related by a known (possibly noisy) graph. This model is useful in problems like recommender systems where the large number of users makes it vital to transfer information between users. Despite its...
Article
Full-text available
We consider influence maximization (IM) in social networks, which is the problem of maximizing the number of users that become aware of a product by selecting a set of "seed" users to expose the product to. While prior work assumes a known model of information diffusion, we propose a parametrization in terms of pairwise reachability which ma...