Vijay V. Raghavan
University of Louisiana at Lafayette | ULL · The Centre for Advanced Computer Studies
PhD, 1978, Univ. of Alberta
About
334 Publications
80,413 Reads
6,162 Citations
Introduction
I am the founding director of the NSF-funded Center for Visual and Decision Informatics (CVDI). The CVDI value proposition is to offer a low-cost, low-risk venue for industry and government agencies to validate early-stage innovations with the involvement of university faculty, postdoctoral scholars, and students.
http://nsfcvdi.org
Additional affiliations
July 1977 - June 1986
August 1986 - present
Publications
Publications (334)
We have developed an alignment-free TSR (Triangular Spatial Relationship)-based computational method for protein structural comparison and motif identification and discovery. To demonstrate the potential applications of the method, we have generated two datasets. One dataset contains five classes: Actin/Hsp70, serine protease (chymotrypsin/trypsin/...
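The TSR key construction is only named, not specified, in this snippet. As a minimal sketch of the general idea, assuming triangles are built from Cα coordinates and keyed by their binned side lengths, the following is illustrative only; the function names, bin size, and similarity measure are not the authors' definitions:

```python
# Illustrative sketch of a Triangular Spatial Relationship (TSR)-style key:
# every triplet of C-alpha coordinates forms a triangle, summarized by its
# discretized, order-independent side lengths. NOT the paper's exact key.
from itertools import combinations
import math

def side_lengths(a, b, c):
    """Euclidean side lengths of the triangle (a, b, c), sorted."""
    return sorted((math.dist(a, b), math.dist(b, c), math.dist(c, a)))

def tsr_keys(ca_coords, bin_size=1.0):
    """Map a list of C-alpha (x, y, z) coordinates to a multiset of triangle keys."""
    keys = {}
    for a, b, c in combinations(ca_coords, 3):
        key = tuple(int(s // bin_size) for s in side_lengths(a, b, c))
        keys[key] = keys.get(key, 0) + 1
    return keys

def similarity(keys1, keys2):
    """Jaccard-style overlap of two key multisets, as a crude structure comparison."""
    shared = sum(min(keys1[k], keys2[k]) for k in keys1.keys() & keys2.keys())
    total = sum(keys1.values()) + sum(keys2.values()) - shared
    return shared / total if total else 0.0

coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 2.0, 0.0), (1.0, 1.0, 3.0)]
print(similarity(tsr_keys(coords), tsr_keys(coords)))   # identical structures -> 1.0
```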
Travel patterns and mobility affect the spread of infectious diseases like COVID-19. However, we do not know to what extent local vs. visitor mobility affects the growth in the number of cases. This study evaluates the impact of state-level local vs. visitor mobility in understanding the growth with respect to the number of cases for COVID spread i...
Containing the COVID-19 pandemic while balancing the economy has proven to be quite a challenge for the world. We still have limited understanding of which combination of policies has been most effective in flattening the curve, given the challenges of the dynamic and evolving nature of the pandemic, lack of quality data, etc. This paper introduces...
Background: Travel patterns of humans play a major part in the spread of infectious diseases. This was evident in the geographical spread of COVID-19 in the United States. However, the impact of this mobility and the transmission of the virus due to local travel, compared to the population traveling across state boundaries, is unknown. This study e...
Human mobility plays an important role in the dynamics of infectious disease spread. Evidence from the initial nationwide lockdowns for COVID-19 indicates that restricting human mobility is an effective strategy to contain the spread. While a direct correlation was observed early on, it is not known how mobility impacted COVID-19 infection growth...
Various time series forecasting methods have been successfully applied for the water-stage forecasting problem. Graphical time series models are a class of multivariate time series models for the spatio-temporal dependencies between the sensors. Constructing graph-based models involves data pre-processing and correlation analysis to capture the dynami...
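The correlation-analysis step mentioned above can be illustrated with a short sketch that connects two sensors whenever their readings are strongly correlated; the array layout, threshold, and random demo data below are assumptions, not details from the paper.

```python
# Sketch of one common way to build a sensor graph for spatio-temporal
# forecasting: connect two water-stage sensors when their readings are
# strongly correlated.
import numpy as np

def correlation_graph(readings, threshold=0.8):
    """readings: array of shape (n_timesteps, n_sensors) -> adjacency matrix."""
    corr = np.corrcoef(readings, rowvar=False)      # (n_sensors, n_sensors)
    adj = (np.abs(corr) >= threshold).astype(float)
    np.fill_diagonal(adj, 0.0)                      # no self-loops
    return adj

rng = np.random.default_rng(0)
demo = rng.normal(size=(500, 6))                    # 500 readings from 6 sensors
print(correlation_graph(demo, threshold=0.3))
```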
Containing the COVID-19 pandemic while balancing the economy has proven to be quite a challenge for the world. We still have limited understanding of which combination of policies has been most effective in flattening the curve, given the challenges of the dynamic and evolving nature of the pandemic, lack of quality data, etc. This paper introduces...
Development of protein 3-D structural comparison methods is essential for understanding protein functions. Some amino acids share structural similarities while others vary considerably. These structures determine the chemical and physical properties of amino acids. Grouping amino acids with similar structures potentially improves the ability to iden...
Development of protein 3-D structural comparison methods is important in understanding protein functions. At the same time, developing such a method is very challenging. In the last 40 years, ever since the development of the first automated structural method, ~200 papers have been published using different representations of structures. The existing me...
Protein 3-D structures are more functionally conserved than sequences, which underscores the need for a computational tool for accurate protein structure comparison at the global and local levels. We have developed a novel geometry-based method for protein 3-D structure comparison using the concept of the Triangular Spatial Relationship (TSR). Ea...
Processing high-volume, high-velocity data streams is an important big data problem in many sciences, engineering, and technology domains. There are many open-source distributed stream processing and cloud platforms that offer low-latency stream processing at scale, but the visualization and user-interaction components of these systems are limited...
A major task in spatio-temporal outlier detection is to identify objects that exhibit abnormal behavior spatially, temporally, or both. Only a few algorithms have been proposed for detecting spatial and/or temporal outliers. One example is the Local Density-Based Spatial Clustering of Applications with Noise (LDBSCAN). Density-Based Spat...
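LDBSCAN itself is not available in scikit-learn; the closely related Local Outlier Factor gives the flavor of density-based local outlier detection. Treating time as a third coordinate and the synthetic data below are assumptions for illustration only.

```python
# Density-based local outlier detection: points whose local density is much
# lower than their neighbors' are flagged as outliers. LOF stands in here for
# the LDBSCAN family of methods named in the abstract.
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(42)
normal = rng.normal(loc=0.0, scale=1.0, size=(200, 3))    # (x, y, t) points
outliers = rng.uniform(low=-6, high=6, size=(5, 3))
points = np.vstack([normal, outliers])

lof = LocalOutlierFactor(n_neighbors=20)
labels = lof.fit_predict(points)            # -1 marks an outlier, 1 an inlier
print("flagged outliers:", np.where(labels == -1)[0])
```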
We provide data-driven machine learning methods that are capable of making real-time influenza forecasts that integrate the impacts of climatic factors and geographical proximity to achieve better forecasting performance. The key contributions of our approach are the application of deep learning methods and the incorporation of environmental and spatio-temp...
Alzheimer’s disease is a major cause of dementia. Its pathology induces complex spatial patterns of brain atrophy that evolve as the disease progresses. The diagnosis requires accurate biomarkers that are sensitive to disease stages. Probabilistic biomarkers naturally support the interpretation of decisions and evaluation of uncertainty associated...
The k Nearest Neighbors (KNN) algorithm has been widely applied in various supervised learning tasks due to its simplicity and effectiveness. However, the quality of KNN decision making is directly affected by the quality of the neighborhoods in the modeling space. Efforts have been made to map data to a better feature space either implicitly with...
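The neighborhood-quality improvements described in this abstract are truncated here; the sketch below only shows the vanilla KNN baseline the work builds on, using a standard scikit-learn dataset as stand-in data.

```python
# Baseline k-Nearest-Neighbors classifier; the paper's neighborhood-quality
# improvements are not reproduced here.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_tr, y_tr)
print("test accuracy:", knn.score(X_te, y_te))
```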
We study the problem of learning to rank from multiple sources. Though multi-view learning and learning to rank have been studied extensively leading to a wide range of applications, multi-view learning to rank as a synergy of both topics has received little attention. The aim of the paper is to propose a composite ranking method while keeping a cl...
In this chapter, we survey various deep learning techniques that are applied in the field of Natural Language Processing. We also propose methods for computing sentence embedding and document embedding. Both sentence embedding and document embedding are able to capture the distribution of hidden concepts in the corresponding sentence or document. T...
When analyzing streaming data, the results can depreciate in value faster than the analysis can be completed and results deployed. This is certainly the case in the area of anomaly detection, where detecting a potential problem as it is occurring (or in the early stages) can permit corrective behavior. However, most anomaly detection methods focus...
For the last decade, the automatic generation of hypotheses from the literature has been widely studied. One common approach is to model the biomedical literature as a concept network; a prediction model is then applied to predict future relationships (links) between pairs of concepts. Typically, this link prediction task can be cast into one of...
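A minimal sketch of the link-prediction framing on a toy concept network follows, using a standard neighborhood score (Adamic-Adar) from networkx and the classic fish-oil/Raynaud's example as toy data; the paper's actual prediction model is not reproduced.

```python
# Link prediction on a toy concept co-occurrence network with a standard
# neighborhood-based score from networkx.
import networkx as nx

G = nx.Graph()
G.add_edges_from([
    ("fish oil", "blood viscosity"),
    ("blood viscosity", "Raynaud's disease"),
    ("fish oil", "platelet aggregation"),
    ("platelet aggregation", "Raynaud's disease"),
])

# Score currently unconnected concept pairs; a higher score suggests a more
# likely future link (candidate hypothesis).
candidates = [("fish oil", "Raynaud's disease")]
for u, v, score in nx.adamic_adar_index(G, candidates):
    print(f"{u} -- {v}: {score:.3f}")
```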
All criminal networks are social networks with multiple channels of communication and collaboration between their members. In this paper, we analyze different types of criminal networks with respect to metrics commonly used in social network analysis literature. We focus mostly on two types of networks: cocaine trading and terrorist activities. We...
Alzheimer's disease is a major cause of dementia. Its diagnosis requires accurate biomarkers that are sensitive to disease stages. In this respect, we regard probabilistic classification as a method of designing a probabilistic biomarker for disease staging. Probabilistic biomarkers naturally support the interpretation of decisions and evaluation o...
Big Data and Data Analytics constitute a brand-new paradigm for the integration of Internet technology in the human and machine context. For the first time in the history of mankind, we are able to transform raw data that are massively produced by humans and machines into knowledge and wisdom capable of supporting smart decision making, i...
Information about events happening in the real world is generated online on social media in real time. Substantial research has been done to detect these events using information posted on websites like Twitter, Tumblr, and Instagram. The information posted depends on the type of platform the website relies upon, such as short messages, pictures,...
The recent rapid growth in Internet speeds and file storage requirements has made cloud storage an appealing option on both a personal and enterprise level. Despite the many benefits offered by cloud storage, many potential users with sensitive data refrain from fully utilizing this service due to valid concerns about information privacy. An establ...
In this study, we provide an overview of the state-of-the-art technologies in programming, computing, and storage of the massive data analytics landscape. We shed light on different types of analytics that can be performed on massive data. For that, we first provide a detailed taxonomy on different analytic types along with examples of each type. N...
Cognitive computing is a nascent interdisciplinary domain. It is a confluence of cognitive science, neuroscience, data science, and cloud computing. Cognitive science is the study of mind and offers theories, mathematical and computational models of human cognition. Cognitive science itself is an interdisciplinary domain and draws upon philosophy,...
Social media generates information about news and events in real time. Given the vast amount of data available and the rate of information propagation, reliably identifying events is a challenge. Most state-of-the-art techniques are post hoc techniques that detect an event after it happened. Our goal is to detect the onset of an event as it is happenin...
Weighted graphs can be used to model any data sets composed of entities and relationships. Social networks, concept networks, and document networks are among the types of data that can be abstracted as weighted graphs. Identifying minimum-sized influential vertices (MIV) in a weighted graph is an important task in graph mining that gains valuable c...
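The MIV algorithm itself is not given in this snippet. One simple greedy heuristic, picking the vertex with the largest edge weight to still-uncovered vertices until everything is covered, illustrates the problem setting; it is not the paper's method, and the toy graph is invented for the example.

```python
# Greedy heuristic for picking a small set of influential vertices in a
# weighted graph: repeatedly take the vertex whose incident edge weights to
# uncovered vertices are largest, until every vertex is covered.
import networkx as nx

def greedy_influential(G):
    uncovered = set(G.nodes)
    chosen = []
    while uncovered:
        def gain(v):
            # total weight of edges from v to vertices not yet covered
            return sum(d.get("weight", 1.0)
                       for u, d in G[v].items() if u in uncovered)
        best = max(uncovered, key=gain)
        chosen.append(best)
        uncovered -= {best} | set(G[best])   # best now covers itself + neighbors
    return chosen

G = nx.Graph()
G.add_weighted_edges_from([("a", "b", 3), ("b", "c", 1),
                           ("c", "d", 2), ("d", "a", 1), ("b", "d", 4)])
print(greedy_influential(G))
```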
Big data requirements are motivating new database management models that can process billions of data requests per second, and established relational models are changing to keep pace. The authors provide practical tools for navigating this shifting product landscape and finding candidate systems that best fit a data manager's application needs.
The recent emergence of a new class of systems for data management has challenged the well-entrenched relational databases. These systems provide several choices for data management under the umbrella term NoSQL. Making a right choice is critical to building applications that meet business needs. Performance, scalability and cost are the principal...
Decision makers in multiple domains are increasingly looking for ways to improve the understanding of real-world phenomena through data collected from Internet devices, including low-cost sensors, smart phones, and online activity. Examples include detecting environmental changes, understanding the impacts of adverse manmade and natural disasters,...
Several real-world observations from streaming data sources, such as sensors, click streams, and social media, can be modeled as time-evolving graphs. There is a lot of interest in domains such as cybersecurity, epidemiology networks, social community networks, and recommendation networks to both study and build systems to track the evolutionary pr...
Adverse drug events (ADEs) are among the leading causes of death in the United States. Although many ADEs are detected during pharmaceutical drug development and the FDA approval process, all of the possible reactions cannot be identified during this period. Currently, post-consumer drug surveillance relies on voluntary reporting systems, such as t...
Various embodiments provide a system, method, and computer program product for sorting and/or selectively retrieving a plurality of documents in response to a user query. More particularly, embodiments are provided that convert each document into a corresponding document language model and convert the user query into a corresponding query language...
Interoperability of annotations across different domains is essential for facilitating the interchange of data between semantic applications. Foundational ontologies, such as SKOS (Simple Knowledge Organization System), play an important role in creating an interoperable layer for annotation. We are proposing a multi-layer ontology schema, named...
While the term Big Data is open to varying interpretation, it is quite clear that the Volume, Velocity, and Variety (3Vs) of data have impacted every aspect of computational science and its applications. The volume of data is increasing at a phenomenal rate and a majority of it is unstructured. With big data, the volume is so large that processing...
Due to the inherent complexity of natural languages, many natural language tasks are ill-posed for mathematically precise algorithmic solutions. To circumvent this problem, statistical machine learning approaches are used for NLP tasks. The emergence of Big Data enables a new paradigm for solving NLP problems — managing the complexity of the proble...
Despite some key problems, big data could fundamentally change scientific research methodology and how businesses develop products and provide services.
Connection subgraphs are small connected subgraphs of a larger graph that well capture the relationship between two or more query nodes. Running a connection subgraph algorithm on a large graph can take a long time. Therefore, a preprocessing step is required to generate the connection subgraphs in a much shorter amount of time. This paper...
This chapter describes a comprehensive granular model for decision making with complex data. This granular model first uses information decomposition to form a horizontal set of granules for each of the data instances. Each granule is a partial view of the corresponding data instance; and collectively all the partial views of that data instance pro...
The distribution of the number of items liked by users plays an important role in designing recommender systems. In the case of implicit feedback, we rarely get many clicking events, compared to large item-based e-commerce sites where preference information is not so rare. In this paper we present a novel hybrid recommendation system based on clusterin...
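As a rough illustration of a clustering-based recommender for implicit feedback, the sketch below clusters items by their user-click vectors and recommends unseen items from the clusters a user has already clicked in; this construction and the toy click matrix are assumptions, not the hybrid scheme proposed in the paper.

```python
# Toy clustering-based recommender for implicit (click) feedback.
import numpy as np
from sklearn.cluster import KMeans

clicks = np.array([            # rows = users, columns = items (1 = clicked)
    [1, 1, 0, 0, 0, 1],
    [1, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 1, 0, 1, 1],
])

item_vectors = clicks.T        # each item described by who clicked it
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(item_vectors)

def recommend(user, top_n=2):
    seen = set(np.flatnonzero(clicks[user]))
    user_clusters = {labels[i] for i in seen}
    candidates = [i for i in range(clicks.shape[1])
                  if i not in seen and labels[i] in user_clusters]
    # rank candidate items by overall popularity
    return sorted(candidates, key=lambda i: clicks[:, i].sum(), reverse=True)[:top_n]

print(recommend(user=0))
```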
This poster describes a project under development in which we propose a framework for automating the development of lightweight ontologies for semantic annotations. When considering building ontologies for annotations in any domain, we follow the process of ontology learning in Stelios 2006, but since we are looking for lightweight ontology, we onl...
We propose and evaluate an unsupervised approach to identify the location of a user purely based on tweet history of that user. We combine the location references from tweets of a user with gazetteers like DBPedia to identify the geolocation of that user at a city level. This can be used for location based personalization services like targeted adv...
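A toy version of the gazetteer-matching idea is sketched below: scan a user's tweet history for place names and vote for the most frequently mentioned city. A hard-coded dictionary stands in for the DBpedia-backed gazetteer, and the sample tweets are invented.

```python
# Gazetteer-based user geolocation, reduced to string matching plus voting.
from collections import Counter

GAZETTEER = {"lafayette": "Lafayette, LA", "new orleans": "New Orleans, LA",
             "baton rouge": "Baton Rouge, LA"}

def locate_user(tweets):
    votes = Counter()
    for tweet in tweets:
        text = tweet.lower()
        for name, city in GAZETTEER.items():
            if name in text:
                votes[city] += 1
    return votes.most_common(1)[0][0] if votes else None

history = ["Traffic on Johnston St in Lafayette again...",
           "Great gumbo in New Orleans this weekend",
           "Back home in Lafayette for the game"]
print(locate_user(history))    # -> Lafayette, LA
```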
Weighted graphs can be used to model any datasets composed of entities and relationships. Social networks, concept networks and document networks are among the types of data that can be abstracted as weighted graphs. Identifying Minimum-sized Influential Vertices (MIV) in a weighted graph is an important task in graph mining that gains valuable com...
The advent of Big Data created a need for out-of-the-box horizontal scalability for data management systems. This ushered in an array of choices for Big Data management under the umbrella term NoSQL. In this paper, we provide a taxonomy and unified perspective on NoSQL systems. Using this perspective, we compare and contrast various NoSQL systems u...
In a large-scale recommendation setting, item-based collaborative filtering is preferable due to the availability of a huge amount of users’ preference information and the relative stability of item-item similarity. Item-based collaborative filtering only uses users’ item preference information to predict recommendations for targeted users. This process...
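The core of item-based collaborative filtering can be shown in a few lines: compute item-item cosine similarities from the user-item matrix, then score unseen items by a similarity-weighted sum of a user's past preferences. The toy ratings matrix is invented and the paper's specific refinements are not reproduced.

```python
# Minimal item-based collaborative filtering sketch.
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

ratings = np.array([           # rows = users, columns = items (0 = unknown)
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [0, 1, 5, 4],
    [1, 0, 4, 5],
], dtype=float)

item_sim = cosine_similarity(ratings.T)      # (n_items, n_items)

def score_items(user):
    """Similarity-weighted scores for every item; seen items are masked out."""
    scores = ratings[user] @ item_sim
    scores[ratings[user] > 0] = -np.inf      # don't re-recommend seen items
    return scores

print("recommend item", int(np.argmax(score_items(0))), "to user 0")
```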
A method, interface, and apparatus for expressing data objects is described. A method for expressing information can comprise the steps of: extracting attributes of a plurality of data objects, wherein the attributes reflect information associated with the data objects; hierarchically grouping the data objects based on the attributes of the data ob...
URL: http://ieeexplore.ieee.org/xpl/tocresult.jsp?isnumber=6690752
The 2013 IEEE/WIC/ACM International Conference on Intelligent Agent Technology (IAT 2013) is held jointly with the 2013 IEEE/WIC/ACM International Conference on Web Intelligence (WI 2013). The IAT 2013 and WI 2013 conferences are sponsored and co-organized by the IEEE Computer Society Technical Committee on Intelligent Informatics (TCII), Web Intel...
Recently, with companies and government agencies saving large repositories of time stream/temporal data, there is a large push for adapting association rule mining algorithms for dynamic, targeted querying. In addition, issues with data processing latency and results depreciating in value with the passage of time create a need for swifter and more...
Voxel-based analysis of neuroimagery provides a promising source of information for early diagnosis of Alzheimer’s disease. However, neuroimaging procedures usually generate high-dimensional data. This complicates statistical analysis and modeling, resulting in high computational complexity and typically more complicated models. This study uses the...
One of the challenges of semantically annotating web documents is the lack of annotation standards and robust ontologies for specific domains. Even those that are available become outdated, because ontologies change over time and need continual maintenance and annotation regeneration. In this paper, we explore the problem of building a unifi...
Analyzing and classifying sequence data based on structural similarities and differences is a mathematical problem of escalating relevance. Indeed, a primary challenge in designing machine learning algorithms for analyzing sequence data is the extraction and representation of significant features. This paper introduces a generalized sequence feature...
Proceedings of the 2013 IEEE International Conference on Big Data, 6-9 October 2013, Santa Clara, CA, USA. IEEE 2013, ISBN 978-1-4799-1292-6
Clinical trials for interventions that seek to delay the onset of Alzheimer’s disease (AD) are hampered by inadequate methods for selecting study subjects who are at risk, and who may therefore benefit from the interventions being studied. Automated monitoring tools may facilitate clinical research and thereby reduce the impact of AD on individuals...
Voxel-based analysis of neuroimagery provides a promising source of information for early diagnosis of Alzheimer’s disease. However, neuroimaging procedures usually generate high-dimensional data. This complicates statistical analysis and modeling, resulting in high computational complexity and typically more complicated models. This study uses the...
Embodiments of the present invention are generally related to systems, methods, computer readable media, and other means for extracting entities, determining the semantic relationships among the entities and generating knowledge. More particularly, some embodiments of the present invention are directed to generating a hypothesis and/or gaining know...
Alzheimer's Disease (AD) is one major cause of dementia. Previous studies have indicated that the use of features derived from Positron Emission Tomography (PET) scans lead to more accurate and earlier diagnosis of AD, compared to the traditional approaches that use a combination of clinical assessments. In this study, we compare Naive Bayes (NB) w...
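The comparison in this abstract is truncated; as a rough illustration of the Naive Bayes baseline it names, a Gaussian Naive Bayes classifier can be cross-validated on synthetic stand-in features (the actual PET-derived features are not available in this snippet).

```python
# Gaussian Naive Bayes baseline on synthetic stand-in features.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=300, n_features=20, n_informative=5,
                           random_state=0)   # stand-in for PET-derived features
scores = cross_val_score(GaussianNB(), X, y, cv=5)
print("5-fold accuracy: %.3f +/- %.3f" % (scores.mean(), scores.std()))
```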
The itemset tree data structure is used in targeted association mining to find rules within a user's sphere of interest. In our earlier work, we proposed two enhancements to unordered itemset trees. The first enhancement consisted of sorting all nodes in lexical order based upon the itemsets they contain. In the second enhancement, called the Min-...
The goal of association mining is to find potentially interesting rules in large repositories of data. Unfortunately, using a minimum support threshold, a standard practice to reduce the processing complexity of association mining, can allow some of these rules to remain hidden. This occurs because not all rules which have high confidence have a high...
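The support/confidence trade-off referred to above can be seen with a few lines of counting: a rule below the minimum-support cutoff is pruned even when its confidence is high. The transactions and threshold are toy values; this is not the paper's algorithm.

```python
# Support vs. confidence on toy transactions: {a} -> {b} has confidence 1.0
# but support 0.25, so a 0.3 minimum-support cutoff would hide it.
transactions = [
    {"a", "b"}, {"a", "b"}, {"c"}, {"c", "d"},
    {"c", "d"}, {"d"}, {"c"}, {"d"},
]

def support(itemset):
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent):
    return support(antecedent | consequent) / support(antecedent)

min_support = 0.3
rule = (frozenset({"a"}), frozenset({"b"}))
print("support:", support(rule[0] | rule[1]))     # 0.25 -> pruned by the cutoff
print("confidence:", confidence(*rule))           # 1.0  -> interesting anyway
```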
Positron Emission Tomography scans are a promising source of information for early diagnosis of Alzheimer's disease. However, such neuroimaging procedures usually generate high-dimensional data. This complicates statistical analysis and modeling, resulting in high computational complexity and typically more complicated models. However, the utilizat...
Granular computing uses granules as basic units to compute with. Granules can be formed by either information abstraction or information decomposition. In this paper, we view information decomposition as a paradigm for processing data with complex structures. More specifically, we apply lossless information decomposition to protein sequence ana...
This article presents our proposal of using a random-walk-model-based approach for quantifying technology emergence and impact for research articles, based on a concept map extracted from related literature databases. The same approach should be easily adaptable to citation networks and author networks.
A metasearch engine is an Information Retrieval (IR) system that can query multiple search engines and aggregate the ranked lists of results returned by them into a single result list of documents, ranked in descending order of relevance to a query. The result aggregation problem has largely been treated as a Multi-Criteria Decision Making (MCDM) problem...
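One of the simplest positional methods for merging ranked lists from multiple engines is Borda count, sketched below with invented document lists; the MCDM-based aggregation treated in the paper is more elaborate than this.

```python
# Borda-count aggregation of ranked result lists from several search engines.
from collections import defaultdict

def borda_merge(ranked_lists):
    scores = defaultdict(float)
    for ranking in ranked_lists:
        n = len(ranking)
        for position, doc in enumerate(ranking):
            scores[doc] += n - position        # top of a list earns the most points
    return sorted(scores, key=scores.get, reverse=True)

engine_a = ["d1", "d2", "d3", "d4"]
engine_b = ["d2", "d1", "d4", "d5"]
engine_c = ["d2", "d3", "d1"]
print(borda_merge([engine_a, engine_b, engine_c]))   # merged list, best first
```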
Computational approaches to generate hypotheses from biomedical literature have been studied intensively in recent years. Nevertheless, it still remains a challenge to automatically discover novel, cross-silo biomedical hypotheses from large-scale literature repositories. In order to address this challenge, we first model a biomedical literature re...
Action rules mining aims to provide recommendations to analysts seeking to achieve a specific change. An action rule is constructed as a series of changes, or actions, which can be made to some of the flexible characteristics of the information system that ultimately triggers a change in the targeted attribute. The existing action rules discovery m...
Mining large graphs to discover relationships between two or more nodes is an important problem. This paper presents a literature review on a specific formulation of that problem, which is referred to as the connection subgraph problem. Connection subgraphs are useful in many applications such as ranking search results, discovering connections betw...