
Dhruba K Bhattacharyya
- PhD in Computer Science and Engineering
- Professor (Full) at Tezpur University
Working towards development of a low-cost, non-invasive solution for CVD detection with minimum false alarms.
About
365 Publications
171,883 Reads
7,688 Citations
Introduction
Machine learning enabled (i) malware defense development and (ii) biomarker identification using differential and co-expression analysis are my high-priority research areas.
Current institution
Additional affiliations
March 1995 - March 1999
March 2004 - July 2017
March 1999 - July 2004
Publications (365)
Hepatobiliary cancers (HBCs) pose a major global health challenge, with a lack of effective targeted biomarkers. Due to their complex anatomical locations, shared risk factors, and the limitations of targeted therapies, generalized treatment strategies are often used for gallbladder cancer (GBC), hepatocellular carcinoma (HCC), and intrahepatic cho...
Cardiovascular diseases (CVDs) are the primary cause of mortality worldwide. The healthcare sector in India currently shows promise for substantial changes, specifically in the utilization and importance of the Internet of Medical Things (IoMT). Edge computing is necessary to make the IoMT more scalable, portable, reliable, and responsive. Security...
Digitizing and preserving cultural heritage in the form of Indian classical dance has emerged as an area of research. The Sattriya classical dance of North-East India (Assam) is one of the eight Indian classical dance forms and one that requires immediate preservation. Sattriya classical dance consists of 29 Asamyukta hastas (single-hand gestures) and 14 Samy...
Supervised learning algorithms are effective in most application domains; however, they have limitations. A single learning model may miss out on some local regions of the feature space, impacting overall performance. Ensemble learning techniques can be helpful here as they bring together a diverse set of learners, ensuring that even if one misses a...
An integral part of network analysis is finding groups of nodes that exhibit similar properties. Community detection techniques are a popular choice for finding such groups, or communities, within a network, and they rely on graph-based methods to achieve this goal. Finding communities in biological networks such as gene co-expression networks...
Single-cell RNA sequencing (scRNA-Seq) technology provides the scope to gain insight into the interplay between intrinsic cellular processes as well as transcriptional and behavioral changes in gene–gene interactions across varying conditions. The high level of scarcity of scRNA-seq data, however, poses a significant challenge for analysis. We prop...
The development of a statistically and biologically competent Community Detection Algorithm (CDA) is essential for extracting hidden information from massive biological datasets. This study introduces a novel community index as well as a CDA based on the newly introduced community index. To validate the effectiveness and robustness of the communities...
There are two types of distributed denial of service (DDoS) attacks: high-rate DDoS (HRDDoS) attacks and low-rate DDoS (LRDDoS) attacks. A shrew attack is an LRDDoS attack that can prove to be more harmful than an HRDDoS attack since it is not easily noticeable and is stealthy. It can cause TCP flows to attain near-zero throughput by sending attack...
Electronic medical records are a patient's digital asset that enhances the information available to doctors for tracking their patients' health. When this information is stored in a secure environment, health examination reports can serve as a dependable repository for thorough observation of a patient's well-being. However, it is crucial for the o...
Long non-coding RNAs (lncRNAs) play crucial roles in the regulation of gene expression and maintenance of genomic integrity through various interactions with DNA, RNA, and proteins. The availability of large-scale sequence data from various high-throughput platforms has opened possibilities to identify, predict, and functionally annotate lncRNAs. A...
Bipolar Disorder (BPD) and Schizophrenia (SCZ) are complex psychiatric disorders with shared symptomatology and genetic risk factors. Understanding the molecular mechanisms underlying these disorders is crucial for refining diagnostic criteria and guiding targeted treatments. In this study, publicly available RNA-seq data from post-mortem samples o...
Malware detection has become a critical aspect of ensuring the security and integrity of computer systems. With the ever-evolving landscape of malicious software, developing effective detection methods is of utmost importance. This study focuses on the identification of important features for malware detection methods, aiming to enhance the accurac...
The rise of technology has resulted in the evolution of data generation at a rapid speed. With the high increase in the volume of data, it has become necessary to store it using reliable and scalable data management systems. Blockchain offers a storage structure that ensures the security and reliability of the stored data. Smart contracts further e...
An edited collection of high-quality research articles on algorithm design in various application domains.
This book presents essential theoretical concepts in data science along with applications. The book discusses topics that are necessary for a data scientist to extract useful information, given voluminous raw data. The book starts with a chapter on generation of data, and preprocessing of raw data before further analysis. Several chapters of this b...
The understanding of the external characteristics of objects that need to be grasped is crucial for enhancing the dexterity of a robotic hand. Utilizing ontology-based knowledge representation (KR) approaches in the field of grasping presents novel opportunities for designing effective object recognition modules. This research paper proposes the de...
The process of community detection in a network uncovers groups of closely connected nodes, known as communities. In the context of gene correlation networks and neuro-degenerative diseases, this study introduces a systematic pipeline for centrality-based community detection using scRNA-Seq data. Comparisons with existing methods demonstrate its su...
This paper analyses the difference between parental cells and cells that acquired radioresistance using scRNA-seq data and investigates the dynamic changes of the transcriptome of cells in response to fractionated irradiation (FIR) towards the identification of potential biomarkers for Esophageal Squamous Cell Carcinoma (ESCC). The divergence of ge...
IoT networks are increasingly being connected to a wide range of devices, and the number of devices connected has significantly increased in recent years. As a consequence, the number of vulnerabilities to IoT networks has also been increasing tremendously. In IoT networks, botnet‐based Distributed Denial of Service attack is challenging due to its...
Feature selection (FS) is the problem of finding the most informative features that lead to optimal classification accuracy. In high-dimensional data classification, FS can save a significant amount of computation time and can help improve classification accuracy. An important issue in many applications is handling the situation where new in...
This article proposes a hybrid classifier for hyperspectral images (HSI) that integrates the merits of two prominent classifiers: the convolutional neural network (CNN) and the ensemble learning method. Both have demonstrated an efficient capability for recognizing patterns in data. The ensemble model performs recognition using the salient features ext...
Most major industries, such as healthcare, are losing large volumes of valuable data and information. Many of them have therefore adopted blockchain technology to save and secure their data, since blockchain's major feature is storing information in an immutable and permanent manner. Blockcha...
Blockchain is a composite technology that combines cryptography and consensus algorithms to solve the traditional distributed database synchronization problem. Due to features such as immutability and traceability, blockchain is considered to be a reliable platform to store shared information. It is an integral part of various modern multi-field infra...
The TUANDROMD dataset contains 4465 instances and 241 attributes. The target attribute for classification is a category (malware vs. goodware). (N.B. This is the preprocessed version of TUANDROMD.)
Delineation of Gallbladder (GB) and identification of gallstones from Computed Tomography (CT) and Ultrasonography (USG) images is an essential step in the radiomic analysis of Gallbladder Cancer (GBC). In this study, we devise a method for effective segmentation of GB from 2D CT images and Gallstones from USG images, by introducing a Rough Density...
Feature selection (FS) is a common preprocessing step in machine learning that selects an informative subset of features, helping a model perform better during prediction or classification. It helps in the design of an intelligent and expert system used in computer vision, image processing, gene expression data analysis, intrusion detection and...
This paper presents a consensus-based approach that incorporates three microarray and three RNA-Seq methods for unbiased and integrative identification of differentially expressed genes (DEGs) as potential biomarkers for critical disease(s). The proposed method performs satisfactorily on two microarray datasets (GSE20347 and GSE23400) and one RNA-S...
Recent years have seen a rise in applications of differential co-expression analysis (DCE) for disease biomarker identification. This paper presents a centrality-based, hub-gene-centric method called Centrality Based Differential Co-Expression Method (CBDCEM) for finding crucial genes for critical diseases. A prominent task of DCE is the identificatio...
Recently, algorithms based on spectral-spatial information have been gaining attention because of their robustness, accuracy and efficiency. In this paper, an SVM-based classification method is proposed that extracts features considering both spectral and spatial information. The proposed method exploits SVM to encode spectral-spatial information o...
Clustering unleashes the power of scRNA-seq through identification of appropriate cell groups. Most existing clustering methods applied on or developed for scRNA-seq data require user inputs. A few also require rigorous external preprocessing. In this paper, we propose an effective clustering method, which integrates required preprocessing steps fo...
A satellite image is remotely sensed image data in which each pixel represents a specific location on Earth. The recorded pixel value is the radiation reflected from the Earth's surface at that location. Multispectral images are those that capture image data at specific frequencies across the electromagnetic spectrum as compared to Panchromatic im...
Recent history has not been kind when it comes to viral infections that jump from animal reservoirs to humans. The re-emergence of mutating strains of such viruses has only added to the misery. In 2019, SARS-CoV-2 established a daunting presence around the world, posing grave threats to global health, economy, livelihood and huma...
Feature selection helps select an optimal subset of features from a large feature space to achieve better classification performance. The performance of a KNN classifier can be improved significantly using an appropriate subset of features from a large feature space. Recent development in General Purpose Graphics Processing Units (GPGPU) has provided...
With the advancements in modern information and control systems, a new generation of systems has emerged, featuring a combination of independently developed cyber and physical processes. These systems are called cyber physical systems (CPSs). CPSs are composed of various interacting elements that monitor and control the physical processes through a...
An extensive empirical study is presented in this work to identify potential biomarkers of ESCC by employing fifteen prominent biclustering algorithms on synthetic and real datasets. For systematic analyses, we implement the algorithms on a variety of synthetic datasets and evaluate the quality of biclusters using recovery and relevance scores. The...
The challenge of identifying modules in a gene interaction network is important for a better understanding of the overall network architecture. In this work, we develop a novel similarity measure called Scaling-and-Shifting Normalized Mean Residue Similarity (SNMRS), based on the existing NMRS technique [1]. SNMRS yields correlation values in the r...
Background
A limitation of traditional differential expression analysis on small datasets involves the possibility of false positives and false negatives due to sample variation. Considering the recent advances in deep learning (DL) based models, we wanted to expand the state-of-the-art in disease biomarker prediction from RNA-seq data using DL. Ho...
Jitani, N., Singha, B., Barman, G., Talukdar, A., Choudhury, B. K., Sarmah, R., Bhattacharyya, D. K.
Thresholding is one of the most widely used techniques for image segmentation, particularly for medical image segmentation. The key idea is the selection of an appropriate intensity value to differentiate the background pixels from the object of interest pixels...
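The key idea of thresholding described above can be illustrated with a minimal sketch. It assumes a grayscale image stored as a list of rows, an object that is brighter than the background, and a hand-picked threshold `t`; the function name `threshold_image` and the sample values are hypothetical, and the sketch does not show how the paper selects its threshold.

```python
def threshold_image(image, t):
    """Return a binary mask: 1 where the pixel intensity exceeds t, else 0.

    Assumption: the object of interest is brighter than the background.
    """
    return [[1 if pixel > t else 0 for pixel in row] for row in image]


# Hypothetical 3x3 grayscale image with a bright region on the right.
image = [
    [12, 15, 200],
    [10, 180, 220],
    [14, 16, 190],
]

# The threshold value 100 is chosen by hand purely for illustration.
mask = threshold_image(image, t=100)
for row in mask:
    print(row)
```

The quality of the segmentation depends entirely on the choice of `t`, which is why threshold selection is the central problem in this family of methods.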
Exploratory analysis of high-throughput gene sample time (GST) data has an impact on biomedical and bioinformatics research. Mining gene expression patterns in such three-dimensional data facilitates the understanding of hidden biological knowledge as well as the underlying complex gene regulatory mechanisms. In particular, we propose a novel semi-supervised...
Hepatobiliary cancers (HBCs) are the most aggressive and sixth most diagnosed cancers globally. Biomarkers for timely diagnosis and targeted therapy in HBCs are still limited. Considering the gap, our objective is to identify unique and overlapping molecular signatures associated with HBCs. We analyzed publicly available transcriptomic datasets on...
scRNA-seq data analysis enables new possibilities for identification of novel cells, specific characterization of known cells and study of cell heterogeneity. The performance of most clustering methods especially developed for scRNA-seq is greatly influenced by user input. We propose a centrality-clustering method named UICPC and compare its perfor...
A number of methods are being developed and used for analysis of gene expression data such as RNA-Seq data. Most of these tools focus on finding genes that are responsible for the disease conditions. Methods such as co-expression network generation, module detection and differential co-expression analysis are used to look into specific changes in t...
Gallbladder cancer (GBC) has a lower incidence rate among the population relative to other cancer types but is a major contributor to the total number of biliary tract system cancer cases. GBC is distinguished from other malignancies by its high mortality, marked geographical variation and poor prognosis. To date no systemic targeted therapy is ava...
Hyperspectral sensors generate huge datasets that convey an abundance of information. However, analyzing and interpreting these data poses many challenges. Deep networks like VGG16 and VGG19 are difficult to apply directly to hyperspectral image (HSI) classification because of their large number of layers, which in turn requires high level...
Density-based clustering has the ability to detect arbitrarily shaped clusters in any dataset. In recent years, several density peak clustering methods have been reported. Among these, a few need user input(s), but the majority use cluster validity indices to provide the best results. In this paper, we propose a density-based user-input-free clustering m...
To promote diligent analysis of the progression of a disease, it is important to identify interesting biomarkers for the disease. Biclustering has already been established as an effective technique to help identify such biomarkers of high biological significance. Although in the recent past, a good number of biclustering techniques have been introd...
K-nearest neighbor (k-nn) is a widely used classifier in machine learning and data mining, and is very simple to implement. The k-nn classifier predicts the class label of an unknown object based on the majority of the computed class labels of its k nearest neighbors. The prediction accuracy of the k-nn classifier depends on the user input value of...
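The majority-vote rule described above can be sketched in a few lines. This is a minimal illustration, assuming Euclidean distance and a tiny hand-made dataset; the function name `knn_predict` and the sample points are hypothetical and are not taken from the publication.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest neighbors.

    Assumption: Euclidean distance is used as the proximity measure.
    """
    # Pair each training point's distance to the query with its label, nearest first.
    neighbors = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    top_k_labels = [label for _, label in neighbors[:k]]
    # Return the most frequent label among the k nearest neighbors.
    return Counter(top_k_labels).most_common(1)[0][0]


# Hypothetical two-class training data.
points = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (0.9, 1.1), (1.2, 0.8)]
labels = ["a", "a", "b", "b", "b"]

query = (0.05, 0.1)
print(knn_predict(points, labels, query, k=3))
print(knn_predict(points, labels, query, k=5))
```

With this data the same query receives a different label for `k=3` and `k=5`, which illustrates why the prediction accuracy is sensitive to the user-supplied value of k.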
Effective biomarkers aid in the early diagnosis and monitoring of breast cancer and thus play an important role in the treatment of patients suffering from the disease. Growing evidence indicates that alteration of expression levels of miRNA is one of the principal causes of cancer. We analyze breast cancer miRNA data to discover a list of bicluste...
Gallbladder cancer (GBC) has a lower incidence rate among the population relative to other cancer types but majorly contributes to the total cancer cases of the biliary tract system. GBC is distinguished from other malignancies due to its high mortality, marked geographical variation and poor prognosis. To date no systemic targeted therapy is avail...
In the near future, the Internet is predicted to be on the cloud, resulting in more complex and more intensive computing, but possibly also a more insecure digital world. The presence of a large number of resources organized densely is a factor in attracting DDoS attacks. Such attacks are arguably more dangerous in private or individual clouds with limited...
Saikia, Manaswita; Bhattacharyya, Dhruba K.
Biclustering has already been established as an effective tool to study gene expression data toward interesting biomarker findings for a given disease. This paper examines the effectiveness of some prominent biclustering algorithms in extracting biclusters of high biological significance toward the identific...
Parkinson’s disease (PD) is one of the most common neurodegenerative disorders. This aging-related disease occurs due to the degenerative loss of tissue or cellular functions in the brain and due to genetic and epigenetic effects. This study was conducted on an RNA-seq dataset of PD collected from BA9 tissues to gain insights into PD. A few RNA-seq ba...
Genes act in groups known as gene modules, which accomplish different cellular functions in the body. The modular nature of gene networks was used in this study to detect functionally enriched modules in samples obtained from COPD patients. We analyzed modules extracted from COPD samples and identified crucial genes associated with the disease COVI...
Sharma, Pooja; Pandey, Anuj K.; Bhattacharyya, Dhruba K.; Kalita, Jugal K.
Genes are the backbone of living bodies. Gene modules are groups of genes responsible for carrying out various life-supporting functions in the body. However, any disruption in the activity of genes leads to an imbalance in the body, referred to as a diseased condition. Par...
Classifying sensor data correctly and quickly has a significant impact on areas such as performance monitoring, user behavior analysis, user accounting, and intrusion detection in the IoT (Internet of Things). This work is an approach to reorganize the features in a dataset of 114 features depending on the relevancy and non-redundancy of an attribute...
In recent years, ransomware has emerged as a new malware epidemic that creates havoc on the Internet. It infiltrates a victim system or network and encrypts all personal files or the whole system using a variety of encryption techniques. Such techniques prevent users from accessing files or the system until the required amount of ransom is paid. In...
In this paper, a clean comparison among different ensemble learning methods for hyperspectral image (HSI) classification is presented. Random Forest (RF), eXtreme Gradient Boosting (XGBoost) and AdaBoost have been exploited, extracting both spectral and spatial features. AdaBoost has been used with two base learners: decision tree (DT) and...
With the rapid growth of technology and IT-enabled services, the potential damage caused by malware is increasing rapidly. A large number of detection methods have been proposed to arrest the growth of malware attacks. The performance of these detection methods is usually established using raw or feature datasets. The non-availability of adequate d...
This paper introduces an enhanced version of Pearson’s correlation coefficient (PCC) to achieve better biclustering-enabled co-expression analysis. The modified measure, called the local Pearson correlation measure (LPCM), helps detect shifting, scaling, and shifting-and-scaling correlation patterns effectively over gene expression data in the presence o...
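To make the shifting, scaling, and shifting-and-scaling patterns concrete, the sketch below computes the standard Pearson's correlation coefficient on a hypothetical expression profile and three transformed copies of it. It illustrates the baseline PCC only, not the LPCM measure, whose definition is not given in this excerpt; the function name `pearson` and the sample values are assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Standard Pearson's correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


# Hypothetical expression profile of one gene across four samples.
gene = [1.0, 3.0, 2.0, 5.0]

shifted = [g + 4.0 for g in gene]               # shifting pattern
scaled = [2.5 * g for g in gene]                # scaling pattern (positive factor)
shifted_scaled = [2.5 * g + 4.0 for g in gene]  # shifting-and-scaling pattern

for name, profile in [("shifted", shifted), ("scaled", scaled),
                      ("shifted and scaled", shifted_scaled)]:
    print(name, round(pearson(gene, profile), 6))
```

On noise-free profiles like these, PCC is invariant to all three transformations, so each pair is reported as perfectly correlated.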