
My T. ThaiUniversity of Florida | UF · Department of Computer and Information Science and Engineering
My T. Thai
PhD in Computer Science
About
321
Publications
44,540
Reads
How we measure 'reads'
A 'read' is counted each time someone views a publication summary (such as the title, abstract, and list of authors), clicks on a figure, or views or downloads the full-text. Learn more
8,844
Citations
Introduction
Additional affiliations
August 2006 - September 2017
July 2006 - present
Publications
Publications (321)
Motivation
Network motif identification (MI) problem aims to find topological patterns in biological networks. Identifying disjoint motifs is a computationally challenging problem using classical computers. Quantum computers enable solving high complexity problems which do not scale using classical computers. In this article, we develop the first q...
Understanding descriptive positive and negative norms, such as their propagation speed, is crucial for shaping individuals’ behaviors towards public health guidelines and designing successful promotion campaigns to encourage positive norms. Unfortunately, conducting in-depth analyses related to community characteristics and the diffusion of descrip...
Understanding the COVID-19 vaccine hesitation, such as who and why, is significant as the adoption of widespread vaccination remains the most effective approach to controlling pandemic outbreaks. Such an understanding can further aid in developing effective immunization programs for future pandemics. Unfortunately, there are numerous aspects to con...
Providing textual concept-based explanations for neurons in deep neural networks (DNNs) is of importance in understanding how a DNN model works. Prior works have associated concepts with neurons based on examples of concepts or a pre-defined set of concepts, thus limiting possible explanations to what the user expects, especially in discovering new...
Quantum Annealing (QA) holds great potential for solving combinatorial optimization problems efficiently. However, the effectiveness of QA algorithms heavily relies on the embedding of problem instances, represented as logical graphs, into the quantum unit processing (QPU) whose topology is in form of a limited connectivity graph, known as the mino...
Graph Neural Networks (GNNs) exhibit tremendous potential in addressing graph-related tasks such as node classification and link prediction. However, training GNNs on large-scale graphs poses computational and memory challenges for centralized training. Graph data is often distributed across multiple parties, and Federated Learning (FL) offers a co...
Alzheimer’s Disease (AD) is a progressive neurodegenerative disease and the leading cause of dementia. Early diagnosis is critical for patients to benefit from potential intervention and treatment. The retina has emerged as a plausible diagnostic site for AD detection owing to its anatomical connection with the brain. However, existing AI models fo...
Network vulnerability assessment, in which a communication between nodes is functional if their distance under a given metric is lower than a pre-defined threshold, has received significant attention recently. However, those works only focused on discrete domain while many practical applications require us to investigate in the continuous domain. M...
p>A multilayered graph is a dispensable data representation tool to comprehend and mine the richness and complexity of complex systems in real-world scenarios. Multifaceted interconnecting relationships cast in multiple layers of the graph construct a holistic, versatile, and powerful framework that enables researchers to effectively process inform...
p>A multilayered graph is a dispensable data representation tool to comprehend and mine the richness and complexity of complex systems in real-world scenarios. Multifaceted interconnecting relationships cast in multiple layers of the graph construct a holistic, versatile, and powerful framework that enables researchers to effectively process inform...
Fairness has been touted as one of the most important issues for responsible AI as sophisticated AI-powered systems increasingly impact human lives. At the same time, access to information is essential in today’s knowledge economy and fundamental to American democracy. However, certain groups of the population might be excluded or lack access to pa...
Target identification by enzymes (TIE) problem aims to identify the set of enzymes in a given metabolic network, such that their inhibition eliminates a given set of target compounds associated with a disease while incurring minimum damage to the rest of the compounds. This is a NP-hard problem, and thus optimal solutions using classical computers...
This work, for the first time, introduces two constant factor approximation algorithms with linear query complexity for non-monotone submodular maximization over a ground set of size n subject to a knapsack constraint, DLA and RLA. DLA is a deterministic algorithm that provides an approximation factor of nearly 6 while RLA is a randomized algorithm...
Fake news is one of the most prominent forms of disinformation. Unfortunately, today’s advanced social media platforms allow for the rapid transmission of fake news, which may negatively impact several aspects of human society. Despite the significant progress in detecting fake news, the focus of most current work lies in the detection based on con...
Quantum annealing (QA) has emerged as a powerful technique to solve optimization problems by taking advantages of quantum physics. In QA process, a bottleneck that may prevent QA to scale up is minor embedding step in which we embed optimization problems represented by a graph, called logical graph, to Quantum Processing Unit (QPU) topology of quan...
Recent development in the field of explainable artificial intelligence (XAI) has helped improve trust in Machine-Learning-as-a-Service (MLaaS) systems, in which an explanation is provided together with the model prediction in response to each query. However, XAI also opens a door for adversaries to gain insights into the black-box models in MLaaS,...
This paper introduces FairDP, a novel mechanism designed to simultaneously ensure differential privacy (DP) and fairness. FairDP operates by independently training models for distinct individual groups, using group-specific clipping terms to assess and bound the disparate impacts of DP. Throughout the training process, the mechanism progressively i...
This work, for the first time, introduces two constant factor approximation algorithms with linear query complexity for non-monotone submodular maximization over a ground set of size $n$ subject to a knapsack constraint, $\mathsf{DLA}$ and $\mathsf{RLA}$. $\mathsf{DLA}$ is a deterministic algorithm that provides an approximation factor of $6+\epsil...
Influence maximization (IM) is formulated as selecting a set of initial users from a social network to maximize the expected number of influenced users. Researchers have made great progresses to design various traditional methods, and their theoretical design and performance gain are close to a limit. In the past few years, learning-based IM method...
Influence maximization (IM) is formulated as selecting a set of initial users from a social network to maximize the expected number of influenced users. Researchers have made great progress in designing various traditional methods, and their theoretical design and performance gain are close to a limit. In the past few years, learning-based IM metho...
Federated Learning (FL) has gained popularity for
its ability to improve model training while protecting user
privacy. However, recent studies have shown that FL can be
vulnerable to active reconstruction attacks by dishonest servers.
Specifically, a dishonest server can obtain users’ private data
in numerous ways via gradient inversion based on th...
Understanding the COVID-19 vaccine hesitancy, such as who and why, is very crucial since a large-scale vaccine adoption remains as one of the most efficient methods of controlling the pandemic. Such an understanding also provides insights into designing successful vaccination campaigns for future pandemics. Unfortunately, there are many factors inv...
Target Identification by Enzymes (TIE) problem aims to identify the set of enzymes in a given metabolic network, such that their inhibition eliminates a given set of target compounds associated with a disease while incurring minimum damage to the rest of the compounds. This is an NP-complete problem, and thus optimal solutions using classical compu...
Federated learning (FL) was originally regarded as a framework for collaborative learning among clients with data privacy protection through a coordinating server. In this paper, we propose a new active membership inference (AMI) attack carried out by a dishonest server in FL. In AMI attacks, the server crafts and embeds malicious parameters into g...
Federated learning (FL) was originally regarded as a framework for collaborative learning among clients with data privacy protection through a coordinating server. In this paper, we propose a new active membership inference (AMI) attack carried out by a dishonest server in FL. In AMI attacks, the server crafts and embeds malicious parameters into g...
Alzheimer's Disease (AD) is a progressive neurodegenerative disease and the leading cause of dementia. Early diagnosis is critical for patients to benefit from potential intervention and treatment. The retina has been hypothesized as a diagnostic site for AD detection owing to its anatomical connection with the brain. Developed AI models for this p...
Although the effects of the social norm on mitigating misinformation are identified, scant knowledge exists about the patterns of social norm emergence, such as the patterns and variations of social tipping in online communities with diverse characteristics. Accordingly, this study investigates the features of social tipping in online communities a...
Federated learning is known to be vulnerable to both security and privacy issues. Existing research has focused either on preventing poisoning attacks from users or on concealing the local model updates from the server, but not both. However, integrating these two lines of research remains a crucial challenge since they often conflict with one anot...
Recent development in the field of explainable artificial intelligence (XAI) has helped improve trust in Machine-Learning-as-a-Service (MLaaS) systems, in which an explanation is provided together with the model prediction in response to each query. However, XAI also opens a door for adversaries to gain insights into the black-box models in MLaaS,...
Recent development in the field of explainable artificial intelligence (XAI) has helped improve trust in Machine-Learning-as-a-Service (MLaaS) systems, in which an explanation is provided together with the model prediction in response to each query. However, XAI also opens a door for adversaries to gain insights into the black-box models in MLaaS,...
Temporal Graph Neural Network (TGNN) has been receiving a lot of attention recently due to its capability in modeling time-evolving graph-related tasks. Similar to Graph Neural Networks, it is also non-trivial to interpret predictions made by a TGNN due to its black-box nature. A major approach tackling this problems in GNNs is by analyzing the mod...
Graph neural networks (GNNs) are susceptible to privacy inference attacks (PIAS) given their ability to learn joint representation from features and edges among nodes in graph data. To prevent privacy leakages in GNNs, we propose a novel heterogeneous randomized response (HETERORR) mechanism to protect nodes' features and edges against PIAS under d...
Graph neural networks (GNNs) are susceptible to privacy inference attacks (PIAs), given their ability to learn joint representation from features and edges among nodes in graph data. To prevent privacy leakages in GNNs, we propose a novel heterogeneous randomized response (HeteroRR) mechanism to protect nodes' features and edges against PIAs under...
Despite recent studies on understanding deep neural networks (DNNs), there exists numerous questions on how DNNs generate their predictions. Especially, given similar predictions on different input samples, are the underlying mechanisms generating those predictions the same? In this work, we propose NeuCEPT, a method to locally discover critical ne...
In the last few years, many explanation methods based on the perturbations of input data have been introduced to improve our understanding of decisions made by black-box models. The goal of this work is to introduce a novel perturbation scheme so that more faithful and robust explanations can be obtained. Our study focuses on the impact of perturbi...
Temporal graph neural networks (TGNNs) have been widely used for modeling time-evolving graph-related tasks due to their ability to capture both graph topology dependency and non-linear temporal dynamic. The explanation of TGNNs is of vital importance for a transparent and trustworthy model. However, the complex topology structure and temporal depe...
In this paper, we show that the process of continually learning new tasks and memorizing previous tasks introduces unknown privacy risks and challenges to bound the privacy loss. Based upon this, we introduce a formal definition of Lifelong DP, in which the participation of any data tuples in the training set of any tasks is protected, under a cons...
In this work, we study the problem of monotone non-submodular maximization with partition matroid constraint. Although a generalization of this problem has been studied in literature, our work focuses on leveraging properties of partition matroid constraint to (1) propose algorithms with theoretical bound and efficient query complexity; and (2) pro...
Despite the great potential of Federated Learning (FL) in large-scale distributed learning, the current system is still subject to several privacy issues due to the fact that local models trained by clients are exposed to the central server. Consequently, secure aggregation protocols for FL have been developed to conceal the local models from the s...
Quantum annealing (QA) that encodes optimization problems into Hamiltonians remains the only near-term quantum computing paradigm that provides sufficient many qubits for real-world applications. To fit larger optimization instances on existing quantum annealers, reducing Hamiltonians into smaller equivalent Hamiltonians provides a promising approa...
In this work, we study the problem of monotone non-submodular maximization with partition matroid constraint. Although a generalization of this problem has been studied in literature, our work focuses on leveraging properties of partition matroid constraint to (1) propose algorithms with theoretical bound and efficient query complexity; and (2) pro...
Ranking nodes based on their centrality stands a fundamental, yet, challenging problem in large-scale networks. Approximate methods can quickly estimate nodes' centrality and identify the most central nodes, but the ranking for the majority of remaining nodes may be meaningless. For example, ranking for less-known websites in search queries is know...
Federated learning is known to be vulnerable to security and privacy issues. Existing research has focused either on preventing poisoning attacks from users or on protecting user privacy of model updates. However, integrating these two lines of research remains a crucial challenge since they often conflict with one another with respect to the threa...
Since 2016, sharding has become an auspicious solution to tackle the scalability issue in legacy blockchain systems. Despite its potential to strongly boost the blockchain throughput, sharding comes with its own security issues. To ease the process of deciding which shard to place transactions, existing sharding protocols use a hash-based transacti...
This paper studies a Group Influence with Minimum cost which aims to find a seed set with smallest cost that can influence all target groups, where each user is associated with a cost and a group is influenced if the total score of the influenced users belonging to the group is at least a certain threshold. As the group-influence function is neithe...
In this paper, we focus on preserving differential privacy (DP) in continual learning (CL), in which we train ML models to learn a sequence of new tasks while memorizing previous tasks. We first introduce a notion of continual adjacent databases to bound the sensitivity of any data record participating in the training process of CL. Based upon that...
Knowledge Distillation (KD) has been considered as a key solution in model compression and acceleration in recent years. In KD, a small student model is generally trained from a large teacher model by minimizing the divergence between the probabilistic outputs of the two. However, as demonstrated in our experiments, existing KD methods might not tr...
In this paper, we focus on preserving differential privacy (DP) in continual learning (CL), in which we train ML models to learn a sequence of new tasks while memorizing previous tasks. We first introduce a notion of continual adjacent databases to bound the sensitivity of any data record participating in the training process of CL. Based upon that...
In this paper, we focus on preserving differential privacy (DP) in continual learning (CL), in which we train ML models to learn a sequence of new tasks while memorizing previous tasks. We first introduce a notion of continual adjacent databases to bound the sensitivity of any data record participating in the training process of CL. Based upon that...
Stimulated by practical applications arising from economics, viral marketing and elections, this paper studies a novel Group Influence with Minimal cost which aims to find a seed set with smallest cost that can influence all target groups, where each user is associated with a cost and a group is influenced if the total score of the influenced users...
Bitcoin is the leading example of a blockchain application that facilitates peer-to-peer transactions without the need for a trusted third party. This paper considers possible attacks related to the decentralized network architecture of Bitcoin. We perform a data driven study of Bitcoin and present possible attacks based on spatial and temporal cha...
In this paper, we study a novel problem, Minimum Robust Multi-Submodular Cover for Fairness (MinRF), as follows: given a ground set V; m monotone submodular functions f_1,...,f_m; m thresholds T_1,...,T_m and a non-negative integer r; MinRF asks for the smallest set S such that f_i(S \ X) ≥ T_i for all i ∈ [m] and |X| ≤ r. We prove that MinRF is in...
Black box has been an important tool in studying computational complexity theory and has been used for establishing the hardness of problems. With an exponential growth in big data recently, data-driven computation has utilized black box as a tool for proving solutions to some computational problems. In this note, we present several observations on...
In this paper, we study a novel problem, Minimum Robust Multi-Submodular Cover for Fairness (MinRF), as follows: given a ground set $V$; $m$ monotone submodular functions $f_1,...,f_m$; $m$ thresholds $T_1,...,T_m$ and a non-negative integer $r$, MinRF asks for the smallest set $S$ such that for all $i \in [m]$, $\min_{|X| \leq r} f_i(S \setminus X...
In Graph Neural Networks (GNNs), the graph structure is incorporated into the learning of node representations. This complex structure makes explaining GNNs' predictions become much more challenging. In this paper, we propose PGM-Explainer, a Probabilistic Graphical Model (PGM) model-agnostic explainer for GNNs. Given a prediction to be explained,...
Graph mining plays a pivotal role across a number of disciplines, and a variety of algorithms have been developed to answer who/what type questions. For example, what items shall we recommend to a given user on an e-commerce platform? The answers to such questions are typically returned in the form of a ranked list, and graph-based ranking methods...
Studying on networked systems, in which a communication between nodes is functional if their distance under a given metric is lower than a pre-defined threshold, has received significant attention recently. In this work, we propose a metric to measure network resilience on guaranteeing the pre-defined performance constraint. This metric is investig...
Graph mining plays a pivotal role across a number of disciplines, and a variety of algorithms have been developed to answer who/what type questions. For example, what items shall we recommend to a given user on an e-commerce platform? The answers to such questions are typically returned in the form of a ranked list, and graph-based ranking methods...
Information can propagate among Online Social Network (OSN) users at a high speed, which makes the OSNs important platforms for viral marketing. Although the viral marketing related problems in OSNs have been extensively studied in the past decade, the existing works all assume known propagation rates. In this paper, we propose a novel model, Dynam...
A major challenge in blockchain sharding protocols is that more than 95% transactions are cross-shard. Not only those cross-shard transactions degrade the system throughput but also double the confirmation time, and exhaust an already scarce network bandwidth. Are cross-shard transactions imminent for sharding schemes? In this paper, we propose a n...
Since 2016, sharding has become an auspicious solution to tackle the scalability issue in legacy blockchain systems. Despite its potential to strongly boost the blockchain throughput, sharding comes with its own security issues. To ease the process of deciding which shard to place transactions, existing sharding protocols use a hash-based transacti...
Although the iterative double auction has been widely used in many different applications, one of the major problems in its current implementations is that they rely on a trusted third party to handle the auction process. This imposes the risk of single point of failures, monopoly, and bribery. In this paper, we aim to tackle this problem by propos...
In this paper, we aim to develop a scalable algorithm to preserve differential privacy (DP) in adversarial learning for deep neural networks (DNNs), with certified robustness to adversarial examples. By leveraging the sequential composition theory in DP, we randomize both input and latent spaces to strengthen our certified robustness bounds. To add...
How strong are the connections between individuals? This is a fundamental question in the study of social networks. In this work, we take a topological view rooted in the idea of local sparsity to answer this question on large social networks to which we have only incomplete access. Prior approaches to measuring network structure are not applicable...
Social Networks (SNs) have been gradually applied by utility companies as an addition to smart grid and are proved to be helpful in smoothing load curves and reducing energy usage. However, SNs also bring in new threats to smart grid: misinformation in SNs may cause smart grid users to alter their demand, resulting in transmission line overloading...
Current implementations of iterative double auction rely on a trusted third-party to handle the auction process. This imposes the risk of single point of failures and monopoly. We tackle this problem by proposing a novel decentralized and trustless blockchain-based framework for iterative double auction. Our design extends the state channel technol...
The smart grid incentivizes distributed agents with local generation (e.g., smart homes, and microgrids) to establish multi-agent systems for enhanced reliability and energy consumption efficiency. Distributed energy trading has emerged as one of the most important multi-agent systems on the power grid by enabling agents to sell their excessive loc...
The current deep learning works on metaphor detection have only considered this task independently, ignoring the useful knowledge from the related tasks and knowledge resources. In this work, we introduce two novel mechanisms to improve the performance of the deep learning models for metaphor detection. The first mechanism employs graph convolution...
Relation Extraction (RE) is one of the fundamental tasks in Information Extraction. The goal of this task is to find the semantic relations between entity mentions in text. It has been shown in many previous work that the structure of the sentences (i.e., dependency trees) can provide important information/features for the RE models. However, the c...