Srinivasan Parthasarathy's research while affiliated with The Ohio State University and other places
What is this page?
This page lists the scientific contributions of an author, who either does not have a ResearchGate profile, or has not yet added these contributions to their profile.
It was automatically created by ResearchGate to create a record of this author's body of work. We create such pages to advance our goal of creating and maintaining the most comprehensive scientific repository possible. In doing so, we process publicly available (personal) data relating to the author as a member of the scientific community.
If you're a ResearchGate member, you can follow this page to keep up with this author's work.
If you are this author, and you don't want us to display this page anymore, please let us know.
It was automatically created by ResearchGate to create a record of this author's body of work. We create such pages to advance our goal of creating and maintaining the most comprehensive scientific repository possible. In doing so, we process publicly available (personal) data relating to the author as a member of the scientific community.
If you're a ResearchGate member, you can follow this page to keep up with this author's work.
If you are this author, and you don't want us to display this page anymore, please let us know.
Publications (197)
Ligand-based drug design aims to identify novel drug candidates of similar shapes with known active molecules. In this paper, we formulated an in silico shape-conditioned molecule generation problem to generate 3D molecule structures conditioned on the shape of a given molecule. To address this problem, we developed a translation- and rotation-equi...
Graph convolutional networks (GCNs) have achieved huge success in several machine learning (ML) tasks on graph-structured data. Recently, several sampling techniques have been proposed for the efficient training of GCNs and to improve the performance of GCNs on ML tasks. Specifically, the subgraph-based sampling approaches such as ClusterGCN and Gr...
Graph Convolutional Networks (GCN) can efficiently integrate graph structure and node features to learn high-quality node embeddings. At Pinterest, we have developed and deployed PinSage, a data-efficient GCN that learns pin embeddings from the Pin-Board graph. Pinterest relies heavily on PinSage which in turn only leverages the Pin-Board graph. Ho...
Graph representation learning models have been deployed for making decisions in multiple high-stakes scenarios. It is therefore critical to ensure that these models are fair. Prior research has shown that graph neural networks can inherit and reinforce the bias present in graph data. Researchers have begun to examine ways to mitigate the bias in su...
Sequential recommendation aims to recommend the next item of users' interest based on their historical interactions. Recently, the self-attention mechanism has been adapted for sequential recommendation, and demonstrated state-of-the-art performance. However, in this manuscript, we show that the self-attention-based sequential recommendation method...
Adaptive optics imaging has enabled the enhanced in vivo retinal visualization of individual cone and rod photoreceptors. Effective analysis of such high-resolution, feature rich images requires automated, robust algorithms. This paper describes RC-UPerNet, a novel deep learning algorithm, for identifying both types of photoreceptors, and was evalu...
In recent years, we have seen the success of network representation learning (NRL) methods in diverse domains ranging from computational chemistry to drug discovery and from social network analysis to bioinformatics algorithms. However, each such NRL method is typically prototyped in a programming environment familiar to the developer. Moreover, su...
As Named Entity Recognition (NER) has been essential in identifying critical elements of unstructured content, generic NER tools remain limited in recognizing entities specific to a domain, such as drug use and public health. For such high-impact areas, accurately capturing relevant entities at a more granular level is critical, as this information...
Session-based recommendation aims to generate recommendations for the next item of users' interest based on a given session. In this manuscript, we develop prospective preference enhanced mixed attentive model (P2MAM) to generate session-based recommendations using two important factors: temporal patterns and estimates of users' prospective prefere...
Graph Convolutional Networks (GCN) can efficiently integrate graph structure and node features to learn high-quality node embeddings. These embeddings can then be used for several tasks such as recommendation and search. At Pinterest, we have developed and deployed PinSage, a data-efficient GCN that learns pin embeddings from the Pin-Board graph. T...
Human mobility analysis plays a crucial role in urban analysis, city planning, epidemic modeling, and even understanding neighborhood effects on individuals’ health. Often, these studies model human mobility in the form of co-location networks. We have recently seen the tremendous success of network representation learning models on several machine...
Recent work uses a Siamese Network, initialized with BioWordVec embeddings (distributed word embeddings), for predicting synonymy among biomedical terms to automate a part of the UMLS (Unified Medical Language System) Metathesaurus construction process. We evaluate the use of contextualized word embeddings extracted from nine different biomedical B...
The UMLS Metathesaurus integrates more than 200 biomedical source vocabularies. During the Metathesaurus construction process, synonymous terms are clustered into concepts by human editors, assisted by lexical similarity algorithms. This process is error-prone and time-consuming. Recently, a deep learning model (LexLM) has been developed for the UM...
The Unified Medical Language System (UMLS) Metathesaurus construction process mainly relies on lexical algorithms and manual expert curation for integrating over 200 biomedical vocabularies. A lexical-based learning model (LexLM) was developed to predict synonymy among Metathesaurus terms and largely outperforms a rule-based approach (RBA) that app...
Many real-world applications deal with data that have an underlying graph structure associated with it. To perform downstream analysis on such data, it is crucial to capture relational information of nodes over their expanded neighborhood efficiently. Herein, we focus on the problem of Collective Classification (CC) for assigning labels to unlabele...
As machine learning becomes more widely adopted across domains, it is critical that researchers and ML engineers think about the inherent biases in the data that may be perpetuated by the model. Recently, many studies have shown that such biases are also imbibed in Graph Neural Network (GNN) models if the input graph is biased. In this work, we aim...
Next-basket recommendation considers the problem of recommending a set of items into the next basket that users will purchase as a whole. In this paper, we develop a novel mixed model with preferences, popularities and transitions (M<sup>2</sup>) for the next-basket recommendation. This method models three important factors in next-basket generatio...
Molecule optimization is a critical step in drug development to improve the desired properties of drug candidates through chemical modification. We have developed a novel deep generative model, Modof, over molecular graphs for molecule optimization. Modof modifies a given molecule through the prediction of a single site of disconnection at the mole...
The study of human cognition and the study of artificial intelligence (AI) have a symbiotic relationship, with advancements in one field often informing or creating new work in the other. Human cognition has many capabilities modern AI systems cannot compete with. One such capability is the detection, identification, and resolution of knowledge gap...
Due to delay in reporting, the daily national and statewide COVID-19 incidence counts are often unreliable and need to be estimated from recent data. This process is known in economics as nowcasting. We describe in this paper a simple random forest statistical model for nowcasting the COVID - 19 daily new infection counts based on historic data alo...
Multiplex networks are complex graph structures in which a set of entities are connected to each other via multiple types of relations, each relation representing a distinct layer. Such graphs are used to investigate many complex biological, social, and technological systems. In this work, we present a novel semi-supervised approach for structure-a...
Existing work on making privacy policies accessible has explored new presentation forms such as color-coding based on the risk factors or summarization to assist users with conscious agreement. To facilitate a more personalized interaction with the policies, in this work, we propose an automated privacy policy question answering assistant that extr...
The current UMLS (Unified Medical Language System) Metathesaurus construction process for integrating over 200 biomedical source vocabularies is expensive and error-prone as it relies on the lexical algorithms and human editors for deciding if the two biomedical terms are synonymous. Recent advances in Natural Language Processing such as Transforme...
Accurately discovering user intents from their written or spoken language plays a critical role in natural language understanding and automated dialog response. Most existing research models this as a classification task with a single intent label per utterance. Going beyond this formulation, we define and investigate a new problem of open intent d...
In many applications such as recidivism prediction, facility inspection, and benefit assignment, it's important for individuals to know the decision-relevant information for the model's prediction. In addition, the model's predictions should be fairly justified. Essentially, decision-relevant features should provide sufficient information for the p...
In the last several years, AI/ML technologies have become pervasive in academia and industry, finding its utility in newer and challenging applications.
Due to delay in reporting, the daily national and statewide COVID-19 incidence counts are often unreliable and need to be estimated from recent data. This process is known in economics as nowcasting. We describe in this paper a simple random forest statistical model for nowcasting the COVID - 19 daily new infection counts based on historic data alo...
Darknet market forums are frequently used to exchange illegal goods and services between parties who use encryption to conceal their identities. The Tor network is used to host these markets, which guarantees additional anonymization from IP and location tracking, making it challenging to link across malicious users using multiple accounts (sybils)...
Opioid and substance misuse is rampant in the United States today, with the phenomenon known as the opioid crisis. The relationship between substance use and mental health has been extensively studied, with one possible relationship being substance misuse causes poor mental health. However, the lack of evidence on the relationship has resulted in o...
Identifying driving styles is the task of analyzing the behavior of drivers in order to capture variations that will serve to discriminate different drivers from each other. This task has become a prerequisite for a variety of applications, including usage-based insurance, driver coaching, driver action prediction, and even in designing autonomous...
A machine learning (ML) based framework for the reduced-order modeling (ROM) of high-speed fluid-structure interaction (FSI) is explored. Specifically, we consider a Mach 2 transitional shock wave boundary layer interaction over a flexible panel in order to construct a machine learning-based Galerkin-free ROM. The properties of the ROM are examined...
Sequential recommendation aims to identify and recommend the next few items for a user that the user is most likely to purchase/review, given the user's purchase/rating trajectories. It becomes an effective tool to help users select favorite items from a variety of options. In this manuscript, we developed hybrid associations models (HAM) to genera...
In drug discovery, molecule optimization is an important step in order to modify drug candidates into better ones in terms of desired drug properties. With the recent advance of Artificial Intelligence, this traditionally in vitro process has been increasingly facilitated by in silico approaches. We present an innovative in silico approach to compu...
Abstract Learning on graphs is a subject of great interest due to the abundance of relational data from real-world systems. Many of these systems involve higher-order interactions (super-dyadic) rather than mere pairwise (dyadic) relationships; examples of these are co-authorship, co-citation, and metabolic reaction networks. Such super-dyadic rela...
Companies' privacy policies are often skipped by the users as they are too long, verbose, and difficult to comprehend. Identifying the key privacy and security risk factors mentioned in these unilateral contracts and effectively incorporating them in a summary can assist users in making a more informed decision when asked to agree to the terms and...
Traditional relational databases contain a lot of latent semantic information that have largely remained untapped due to the difficulty involved in automatically extracting such information. Recent works have proposed unsupervised machine learning approaches to extract such hidden information by textifying the database columns and then projecting t...
Routing newly posted questions (a.k.a cold questions) to potential answerers with suitable expertise in Community Question Answering sites (CQAs) is an important and challenging task. The existing methods either focus only on embedding the graph structural information and are less effective for newly posted questions, or adopt manually engineered f...
Many real-world systems involve higher-order interactions and thus demand complex models such as hypergraphs. For instance, a research article could have multiple collaborating authors, and therefore the co-authorship network is best represented as a hypergraph. In this work, we focus on the problem of hyperedge prediction. This problem has immense...
Increasingly, critical decisions in public policy, governance, and business strategy rely on a deeper understanding of the needs and opinions of constituent members (e.g. citizens, shareholders). While it has become easier to collect a large number of opinions on a topic, there is a necessity for automated tools to help navigate the space of opinio...
Visual Question Answering (VQA) systems are tasked with answering natural language questions corresponding to a presented image. Current VQA datasets typically contain questions related to the spatial information of objects, object attributes, or general scene questions. Recently, researchers have recognized the need for improving the balance of su...
Next-basket recommendation considers the problem of recommending a set of items into the next basket that users will purchase as a whole. In this paper, we develop a new mixed model with preferences and hybrid transitions for the next-basket recommendation problem. This method explicitly models three important factors: 1) users' general preferences...
We developed a hybrid associations model (HAM) to generate sequential recommendations using two factors: 1) users' long-term preferences and 2) sequential, both high-order and low-order association patterns in the users' most recent purchases/ratings. HAM uses simplistic pooling to represent a set of items in the associations. We compare HAM with t...
Summarizing a document within an allocated budget while maintaining its major concepts is a challenging task. If the budget can take any arbitrary value and not known beforehand, it becomes even more difficult. Most of the existing methods for abstractive summarization, including state-of-the-art neural networks are data intensive. If the number of...
Increasingly, critical decisions in public policy, governance, and business strategy rely on a deeper understanding of the needs and opinions of constituent members (e.g. citizens, shareholders). While it has become easier to collect a large number of opinions on a topic, there is a necessity for automated tools to help navigate the space of opinio...
Many real-world systems consist of entities that exhibit complex group interactions rather than simple pairwise relationships; such multi-way relations are more suitably modeled using hypergraphs. In this work, we generalize the framework of modularity maximization, commonly used for community detection on graphs, for the hypergraph clustering prob...
Sociologists associate the spatial variation of crime within an urban setting, with the concept of collective efficacy. The collective efficacy of a neighborhood is defined as social cohesion among neighbors combined with their willingness to intervene on behalf of the common good. Sociologists measure collective efficacy by conducting survey studi...
Routing newly posted questions (a.k.a cold questions) to potential answerers with the suitable expertise in Community Question Answering sites (CQAs) is an important and challenging task. The existing methods either focus only on embedding the graph structural information and are less effective for newly posted questions, or adopt manually engineer...
Sociologists associate the spatial variation of crime within an urban setting, with the concept of collective efficacy. The collective efficacy of a neighborhood is defined as social cohesion among neighbors combined with their willingness to intervene on behalf of the common good. Sociologists measure collective efficacy by conducting survey studi...
Reducing traffic accidents is an important public safety challenge, therefore, accident analysis and prediction has been a topic of much research over the past few decades. Using small-scale datasets with limited coverage, being dependent on extensive set of data, and being not applicable for real-time purposes are the important shortcomings of the...
Efficiently finding small samples with high diversity from large graphs has many practical applications such as community detection and online survey. This paper proposes a novel scalable node sampling algorithm for large graphs that can achieve better spread or diversity across communities intrinsic to the graph without requiring any costly pre-pr...
Motivation:
Graph embedding learning which aims to automatically learn low-dimensional node representations, has drawn increasing attention in recent years. To date, most recent graph embedding methods are evaluated on social and information networks and are not comprehensively studied on biomedical networks under systematic experiments and analys...
Table is a popular data format to organize and present relational information. Users often have to manually compose tables when gathering their desiderate information (e.g., entities and their attributes) for decision making. In this work, we propose to resolve a new type of heterogeneous query viz: tabular query, which contains a natural language...
Reducing traffic accidents is an important public safety challenge, therefore, accident analysis and prediction has been a topic of much research over the past few decades. Using small-scale datasets with limited coverage, being dependent on extensive set of data, and being not applicable for real-time purposes are the important shortcomings of the...
Pattern discovery in geo-spatiotemporal data (such as traffic and weather data) is about finding patterns of collocation, co-occurrence, cascading, or cause and effect between geospatial entities. Using simplistic definitions of spatiotemporal neighborhood (a common characteristic of the existing general-purpose frameworks) is not semantically repr...
Natural disasters such as floods, forest fires, and hurricanes can cause catastrophic damage to human life and infrastructure. We focus on response to hurricanes caused by both river water flooding and storm surge. Using models for storm surge simulation and flood extent prediction, we generate forecasts about areas likely to be highly affected by...
Directed graphs have been widely used in Community Question Answering services (CQAs) to model asymmetric relationships among different types of nodes in CQA graphs, e.g., question, answer, user. Asymmetric transitivity is an essential property of directed graphs, since it can play an important role in downstream graph inference and analysis. Quest...
In this paper, we propose to solve the directed graph embedding problem via a two stage approach: in the first stage, the graph is symmetrized in one of several possible ways, and in the second stage, the so-obtained symmetrized graph is embeded using any state-of-the-art (undirected) graph embedding algorithm. Note that it is not the objective of...
Scientific data analysis pipelines face scalability bottlenecks when processing massive datasets that consist of millions of small files. Such datasets commonly arise in domains as diverse as detecting supernovae and post-processing computational fluid dynamics simulations. Furthermore, applications often use inference frameworks such as TensorFlow...
In this paper we propose Fractal, a high performance and high productivity system for supporting distributed graph pattern mining (GPM) applications. Fractal employs a dynamic (auto-tuned) load-balancing based on a hierarchical and locality-aware work stealing mechanism, allowing the system to adapt to different workload characteristics. Additional...
Motivation: Graph embedding learning which aims to automatically learn low-dimensional node representations has drawn increasing attention in recent years. To date, most recent graph embedding methods are mainly evaluated on social and information networks and have yet to be comprehensively studied on biomedical networks under systematic experiment...
Reducing traffic accidents is an important public safety challenge. However, the majority of studies on traffic accident analysis and prediction have used small-scale datasets with limited coverage, which limits their impact and applicability; and existing large-scale datasets are either private, old, or do not include important contextual informat...
Dynamically extracting and representing continually evolving knowledge entities is an essential scaffold for grounded intelligence and decision making. Creating knowledge schemas for newly emerging, unfamiliar, domain-specific ideas or events poses the following challenges: (i) detecting relevant, often previously unknown concepts associated with t...