Swagatam Das

Indian Statistical Institute | ISI · Electronics and Communication Sciences Unit (ECSU)

Doctor of Engineering

About

492
Publications
158,042
Reads
23,484
Citations
Additional affiliations
July 2011 - present
Indian Statistical Institute
Position
  • Professor (Assistant)
June 2006 - July 2011
Jadavpur University
Position
  • Professor (Assistant)


Publications (492)
Article
In practical situations, it is very often desirable to detect multiple optimally sustainable solutions of an optimization problem. The population-based evolutionary multimodal optimization algorithms can be very helpful in such cases. They detect and maintain multiple optimal solutions during the run by incorporating specialized niching operations...
Article
Full-text available
The performance of a clustering algorithm can be improved by assigning appropriate weights to different features of the data. Such feature weighting is likely to reduce the effect of noise and irrelevant features while enhancing the effect of the discriminative features simultaneously. For the clustering purpose, feature-weighted dissimilarity meas...
Article
Full-text available
Differential Evolution (DE) is currently one of the most competitive Evolutionary Algorithms (EAs) for optimization problems involving continuous parameters. This article presents three very simple modifications to the basic DE scheme such that its performance can be improved and made scalable for optimizing functions having a real-valued moderate-...
Article
Ambiguity in a dataset, characterized by data points having multiple target labels, may occur in many supervised learning applications. Such ambiguity originates naturally or from misinterpretation, faulty encoding, and/or incompleteness of data. However, most applications demand that a data point be assigned a single label. In such cases, the supe...
Preprint
Full-text available
The Gromov-Wasserstein (GW) distance is an effective measure of alignment between distributions supported on distinct ambient spaces. Calculating essentially the mutual departure from isometry, it has found vast usage in domain translation and network analysis. It has long been shown to be vulnerable to contamination in the underlying measures. All...
Article
Full-text available
Super-Resolution (SR) is a time-hallowed image processing problem that aims to improve the quality of a Low-Resolution (LR) sample up to the standard of its High-Resolution (HR) counterpart. We aim to address this by introducing Super-Resolution Generator (SuRGe), a fully-convolutional Generative Adversarial Network (GAN)-based architecture for SR....
Article
Capsule networks (CapsNets) aim to parse images into a hierarchy of objects, parts, and their relationships using a two-step process involving part–whole transformation and hierarchical component routing. However, this hierarchical relationship modeling is computationally expensive, which has limited the wider use of CapsNet despite its potential a...
Article
Full-text available
Supervised classification problems from the real world typically face a challenge characterized by the scarcity of samples in one or more target classes compared to the rest of the majority classes. In response to such class imbalance, we propose an oversampling technique based on clustering, aiming to populate the minority class with synthetic sam...
Article
Many-to-many voice conversion (VC) is a technique aimed at mapping speech features between multiple speakers during training and transferring the vocal characteristics of one source speaker to another target speaker, all while maintaining the content of the source speech unchanged. Existing research highlights a notable gap between the original and...
Conference Paper
Voice conversion (VC) is the speech-to-speech (STS) synthesis process that converts the vocal identity of a source speaker to a target speaker by keeping the linguistic content unaltered. In recent years, VC research has been explored using generative adversarial network (GAN) models. However, a substantial difference exists between the real and th...
Article
Graph Neural Networks (GNNs) have emerged as one of the most powerful approaches for learning on graph-structured data, even though they are mostly restricted to being shallow in nature. This is because node features tend to become indistinguishable when multiple layers are stacked together. This phenomenon is known as over-smoothing. This paper id...
Preprint
Full-text available
Evolutionary algorithms (EA), a class of stochastic search methods based on the principles of natural evolution, have received widespread acclaim for their exceptional performance in various real-world optimization problems. While researchers worldwide have proposed a wide variety of EAs, certain limitations remain, such as slow convergence speed a...
Article
Principal component analysis (PCA) is a fundamental tool for data visualization, denoising, and dimensionality reduction. It is widely popular in statistics, machine learning, computer vision, and related fields. However, PCA is well-known to fall prey to outliers and often fails to detect the true underlying low-dimensional structure within the da...
Chapter
Full-text available
In recent years, analyzing historical monuments using computer vision techniques has become extremely important for renovation, tourism experience, and preservation. While such studies are common for European architectural research, their findings are not directly applicable to Indian Monuments as the latter has significant differences in architect...
Article
The mean shift algorithm is a simple yet very effective clustering method widely used for image and video segmentation as well as other exploratory data analysis applications. Recently, a new algorithm called MeanShift++ (MS++) for low-dimensional clustering was proposed with a speedup of 4000 times over the vanilla mean shift. In this work, starti...
Chapter
We introduce the Hamiltonian Monte Carlo Particle Swarm Optimizer (HMC-PSO), an optimization algorithm that reaps the benefits of both Exponentially Averaged Momentum PSO and HMC sampling. The coupling of the position and velocity of each particle with Hamiltonian dynamics in the simulation allows for extensive freedom for exploration and exploitat...
Article
Full-text available
As the world moves towards industrialization, optimization problems become more challenging to solve in a reasonable time. More than 500 new metaheuristic algorithms (MAs) have been developed to date, with over 350 of them appearing in the last decade. The literature has grown significantly in recent years and should be thoroughly reviewed. In this...
Preprint
Full-text available
In this document, we use the available statistical techniques and certain methodologies to understand the characteristics of a card game in terms of “skill” vs. “chance”. In particular, we consider certain kinds of card games here, for example “Rummy”. Studies analyzing the impact of “skill” vs. “chance” in games have focused on evaluating the long...
Article
Full-text available
Digitized methodologies in the recent era contribute to various fields of automation that used to hold different interests and meanings of human life. Buildings with historical significance, cultural values, and beliefs are becoming an interdisciplinary field of interest, engaging more computer scientists nowadays. Such structures need more attenti...
Article
Full-text available
Recently, convolutional neural networks (CNNs) have shown promising achievements in various computer vision tasks. However, designing a CNN model architecture requires an expert with deep domain knowledge, which can be difficult for new researchers solving real-world problems like medical image diagnosis. Neural architecture search (NAS) is an...
Article
The problem of class imbalance has always been considered as a significant challenge to traditional machine learning and the emerging deep learning research communities. A classification problem can be considered as class imbalanced if the training set does not contain an equal number of labeled examples from all the classes. A classifier trained o...
Chapter
Due to the complex topology of the search space, expensive multi-objective evolutionary algorithms (EMOEAs) emphasize enhancing the exploration capability. Many algorithms use ensembles of surrogate models to boost the performance. Generally, the surrogate-based model either works out the solution’s fitness by approximating the evaluation function...
Article
Kernel $k$-means clustering is a powerful tool for unsupervised learning of non-linearly separable data. Its merits are thoroughly validated on a suite of simulated datasets and real data benchmarks that feature nonlinear and multi-view separation. Since the earliest attempts, researchers have noted that such algorithms often become trapped by l...
Article
Artificial agents are used in autonomous systems such as autonomous vehicles, autonomous robotics, and autonomous drones to make predictions based on data generated by fusing the values from many sources such as different sensors. Malfunctioning of sensors was noticed in the robotics domain. The correct observation from sensors corresponds to the t...
Article
Most metaheuristic optimizers rely heavily on precisely setting their control parameters and search operators to perform well. Considering the complexity of real-world problems, it is always preferable to adjust control parameter values automatically rather than clamping them to a fixed value. In recent years, Spherical Search (SS) has emerged as a...
Preprint
Full-text available
Capsule networks (CapsNets) aim to parse images into a hierarchical component structure that consists of objects, parts, and their relations. Despite their potential, they are computationally expensive and pose a major drawback, which limits utilizing these networks efficiently on more complex datasets. The current CapsNet models only compare their...
Preprint
Full-text available
In machine learning and computer vision, mean shift (MS) qualifies as one of the most popular mode-seeking algorithms used for clustering and image segmentation. It iteratively moves each data point to the weighted mean of its neighborhood data points. The computational cost required to find the neighbors of each data point is quadratic to the numb...
Article
Full-text available
This paper presents an approach to solving a variant of the well-known Travelling Salesman Problem (TSP) using gamesourcing. In contemporary literature, the TSP is solved by a wide spectrum of modern as well as classical computational methods. We would like to point out the possibility of solving such problems by computer game playing that is ca...
Article
Clusters in real data are often restricted to low-dimensional subspaces rather than the entire feature space. Recent approaches to circumvent this difficulty are often computationally inefficient and lack theoretical justification in terms of their large-sample behavior. This article deals with the problem by introducing an entropy incentive term t...
Article
Image-to-image (I2I) translation has become a key asset for generative adversarial networks. Convolutional neural networks (CNNs), despite having a significant performance, are not able to capture the spatial relationships among different parts of an object and, thus, do not qualify as the ideal representative model for image translation tasks. As...
Preprint
Full-text available
We introduce the Hamiltonian Monte Carlo Particle Swarm Optimizer (HMC-PSO), an optimization algorithm that reaps the benefits of both Exponentially Averaged Momentum PSO and HMC sampling. The coupling of the position and velocity of each particle with Hamiltonian dynamics in the simulation allows for extensive freedom for exploration and exploitat...
Preprint
Few-shot learning aims to transfer the knowledge acquired from training on a diverse set of tasks, from a given task distribution, to generalize to unseen tasks, from the same distribution, with a limited amount of labeled data. The underlying requirement for effective few-shot generalization is to learn a good representation of the task manifold....
Article
Differential Evolution (DE) has been widely appraised as a simple yet robust population-based, non-convex optimization algorithm primarily designed for continuous optimization. Two important control parameters of DE are the scale factor F, which controls the amplitude of a perturbation step on the current solutions and the crossover rate Cr, which...
Chapter
In the field of population-based multi-objective optimization, a non-dominated sorting approach amounts to sorting a set of candidate solutions with multiple objective function values, based on their dominance relations, and finding the solutions distributed into the first front set, second front set, and so on. A fast non-dominated sorting approach u...
Article
Full-text available
Energy disaggregation (ED), with minimal infrastructure, can create energy awareness and thus promote energy efficiency by providing appliance-level consumption information. However, ED is highly ill-posed and gets complicated with increase in number and type of devices, similarity between devices, measurement errors, etc. To design, test, and benc...
Article
The class of center-based clustering algorithms offers methods to efficiently identify clusters in data sets, making them applicable to larger data sets. While a data set may contain several features, not all of them may be equally informative or helpful towards cluster detection. Therefore, sparse center-based clustering methods offer a way to sel...
Preprint
The problem of linear predictions has been extensively studied for the past century under pretty generalized frameworks. Recent advances in the robust statistics literature allow us to analyze robust versions of classical linear models through the prism of Median of Means (MoM). Combining these approaches in a piecemeal way might lead to ad-hoc pro...
Book
This book gathers selected papers presented at the 6th International Conference on Artificial Intelligence and Evolutionary Computations in Engineering Systems, held at the Anna University, Chennai, India, from 20 to 22 April 2020. It covers advances and recent developments in various computational intelligence techniques, with an emphasis on the d...
Article
Voice Conversion (VC) emerged as a significant domain of research in the field of speech synthesis in recent years due to its emerging application in voice-assistive technology, automated movie dubbing, and speech-to-singing conversion to name a few. VC basically deals with the conversion of vocal style of one speaker to another speaker while keepi...
Article
Full-text available
In a world withstanding the waves of a raging pandemic, respiratory disease detection from chest radiological images using machine learning approaches has never been more important for a widely accessible and prompt initial diagnosis. A standard machine learning disease detection workflow that takes an image as input and provides a diagnosis in ret...
Article
Cooperative co-evolution (CC) is a practical and efficient evolutionary framework for solving large-scale global optimization problems (LSGOPs). The performance of CC depends on how variables are being grouped and can be improved through guided variable decomposition for various optimization problems. However, achieving a proper variable decomposition is computationally expensive. This paper proposes an effective yet...
Article
Over the last three decades, numerous search strategies have been introduced within the framework of different evolutionary algorithms (EAs). Most of the popular search strategies operate on the hypercube (HC) search model, and search models based on other hypershapes, such as hyper-spherical (HS), have not yet been well investigated. The recently devel...
Article
Full-text available
The development of a computer-aided disease detection system to ease the long and arduous manual diagnostic process is an emerging research interest. Living through the recent outbreak of the COVID-19 virus, we propose a machine learning and computer vision algorithms-based automatic diagnostic solution for detecting the COVID-19 infection. Our pro...
Article
This article introduces a version of the Self-Organizing Migrating Algorithm with a narrowing search space strategy named iSOMA. Compared to the previous two versions, SOMA T3A and Pareto that ranked 3rd and 5th respectively in the IEEE CEC (Congress on Evolutionary Computation) 2019 competition, the iSOMA is equipped with more advanced features wi...
Article
Random mechanisms including mutations are an internal part of evolutionary algorithms, which are based on the fundamental ideas of Darwin’s theory of evolution as well as Mendel’s theory of genetic heritage. In this paper, we debate whether pseudo-random processes are needed for evolutionary algorithms or whether deterministic chaos, which is not a...
Preprint
Recent advances in center-based clustering continue to improve upon the drawbacks of Lloyd's celebrated $k$-means algorithm over $60$ years after its introduction. Various methods seek to address poor local minima, sensitivity to outliers, and data that are not well-suited to Euclidean measures of fit, but many are supported largely empirically. Mo...
Preprint
Full-text available
The introduction of Variational Autoencoders (VAE) has been marked as a breakthrough in the history of representation learning models. Besides having several accolades of its own, VAE has successfully flagged off a series of inventions in the form of its immediate successors. Wasserstein Autoencoder (WAE), being an heir to that realm carries with i...
Article
We provide uniform concentration bounds on the kernel k-means clustering objective based on Rademacher complexity by posing the underlying problem as a risk minimisation task. This approach results in state-of-the-art convergence rates on the excess risk besides the eventual establishment of strong consistency of cluster centers.
Article
In this paper, we propose a novel clustering technique that uses the simple idea of creating a graph on the data points based on nearest neighbors and identifying clusters by finding its connected components. The algorithm forms the graph based on a border detection and an outlier detection technique. We also propose a novel outlier detection tech...
Article
Constructing adversarial perturbations for deep neural networks is an important direction of research. Crafting image-dependent adversarial perturbations using white-box feedback has hitherto been the norm for such adversarial attacks. However, black-box attacks are much more practical for real-world applications. Universal perturbations applicable...
Article
Generally, Synthetic Benchmark Problems (SBPs) are utilized to assess the performance of metaheuristics. However, these SBPs may include various unrealistic properties. As a consequence, performance assessment may lead to underestimation or overestimation. To address this issue, a few benchmark suites containing real-world problems have been proposed...
Chapter
Many of the existing approaches to anomaly detection are based upon supervised learning and heavily dependent on training datasets. However, anomalies rarely occur in most industrial systems. Hence it is challenging to retrieve a training dataset labeled with true anomalies. Therefore, this motivates us to investigate such scenarios where it is ard...
Conference Paper
Full-text available
We propose a novel Evolutionary Algorithm (EA) based on the Differential Evolution algorithm for solving global numerical optimization problem in real-valued continuous parameter space. The proposed MadDE algorithm leverages the power of the multiple adaptation strategy with respect to the control parameters and search mechanisms, and is tested on...
Article
Mixed-Integer Non-Linear Programming (MINLP) is not rare in real-world applications such as portfolio investment. It has brought great challenges to optimization methods due to the complicated search space that has both continuous and discrete variables. This paper considers the multi-objective constrained portfolio optimization problems that can b...
Article
Mean shift is a simple iterative procedure that gradually shifts data points towards the mode which denotes the highest density of data points in the region. Mean shift algorithms have been effectively used for data denoising, mode seeking, and finding the number of clusters in a dataset in an automated fashion. However, the merits of mean shift...
Preprint
Full-text available
The concept of Entropy plays a key role in Information Theory, Statistics, and Machine Learning. This paper introduces a new entropy measure, called the t-entropy, which exploits the concavity of the inverse-tan function. We analytically show that the proposed t-entropy satisfies the prominent axiomatic properties of an entropy measure. We demonstra...
Preprint
Full-text available
Voice Conversion (VC) emerged as a significant domain of research in the field of speech synthesis in recent years due to its emerging application in voice-assisting technology, automated movie dubbing, and speech-to-singing conversion to name a few. VC basically deals with the conversion of vocal style of one speaker to another speaker while keepi...
Article
Full-text available
Power flow (PF) analysis of microgrids (MGs) has been gaining a lot of attention due to the evolution of islanded MGs. To calculate islanded MGs’ PF solution, a globally convergent technique is proposed using Differential Evolution (DE), a popular optimization algorithm for global non-convex optimization. This paper formulates the PF problem as a c...
Article
Bi-clustering refers to the task of partitioning the rows and columns of a data matrix simultaneously. Although empirically useful, the theoretical aspects of bi-clustering techniques have not been studied in-depth. We present a framework for investigating the statistical guarantees behind the sparse bi-clustering algorithm by using the Vapnik–Cher...
Article
Recently, several numerical algorithms have been proposed to solve the power flow (PF) problems of islanded microgrids (MGs). However, these algorithms approximate the steady-state model of the distributed generation units (DGs) as linear equations. In some cases, this linear approximation leads to inaccurate PF solutions due to dead-zone and gener...
Article
Full-text available
The simulation-driven metaheuristic algorithms have been successful in solving numerous problems compared to their deterministic counterparts. Despite this advantage, the stochastic nature of such algorithms resulted in a spectrum of solutions by a certain number of trials that may lead to the uncertainty of quality solutions. Therefore, it is of u...
Article
To solve the nonconvex constrained optimization problems (COPs) over continuous search spaces by using a population-based optimization algorithm, balancing between the feasible and infeasible solutions in the population plays an important role over different stages of the optimization process. To keep this balance, we propose a constraint handling...
Preprint
Principal Component Analysis (PCA) is a fundamental tool for data visualization, denoising, and dimensionality reduction. It is widely popular in Statistics, Machine Learning, Computer Vision, and related fields. However, PCA is well known to fall prey to the presence of outliers and often fails to detect the true underlying low-dimensional structu...
Article
Clustering with Bregman divergence has been used in literature to unify centroid‐based parametric clustering approaches and to allow detection of non‐spherical clusters within the data. Although empirically useful, the large sample theoretical aspects of Bregman clustering techniques remain largely unexplored. In this paper, we attempt to bridge th...
Book
This book presents the peer-reviewed proceedings of the Sixth International Conference on Intelligent Computing and Applications (ICICA 2020), held at Government College of Engineering, Keonjhar, Odisha, India, during December 22–24, 2020. The book includes the latest research on advanced computational methodologies such as neural networks, fuzzy s...
Article
Multiple kernel clustering methods have been quite successful recently especially concerning the multi-view clustering of complex datasets. These methods simultaneously learn a multiple kernel metric while clustering in an unsupervised setting. With the motivation that some minimal supervision can potentially increase their effectiveness, we propos...
Article
In this paper, we propose a Lasso Weighted $k$-means ($LW$-$k$-means) algorithm, as a simple yet efficient sparse clustering procedure for high-dimensional data where the number of features ($p$) can be much higher than the number of observations ($n$). The $LW$-$k$-means method imposes an $\ell_1$ regularization term involving the feature weigh...
Preprint
Mean shift is a simple iterative procedure that gradually shifts data points towards the mode which denotes the highest density of data points in the region. Mean shift algorithms have been effectively used for data denoising, mode seeking, and finding the number of clusters in a dataset in an automated fashion. However, the merits of mean shift...
Preprint
Kernel $k$-means clustering is a powerful tool for unsupervised learning of non-linearly separable data. Since the earliest attempts, researchers have noted that such algorithms often become trapped by local minima arising from non-convexity of the underlying objective function. In this paper, we generalize recent results leveraging a general famil...
Preprint
Many of the existing approaches to anomaly detection are based upon supervised learning and heavily dependent on training datasets. However, anomalies rarely occur in most industrial systems. Hence it is challenging to retrieve a training dataset labeled with proper anomalies. Therefore, this motivates us to investigate such scenarios where it is a...
Conference Paper
Full-text available
Activation functions play an important role in neural networks by introducing non-linear properties. Thus, they are considered one of the essential ingredients among the building blocks of a neural network. But the selection of the appropriate activation function for the enhancement of model accuracy is strenuous in a se...
Article
During the last two decades, the notion of multiobjective optimization (MOO) has been successfully adopted to solve the nonconvex constrained optimization problems (COPs) in their most general forms. However, such works mainly utilized the Pareto dominance-based MOO framework while the other successful MOO frameworks, such as the reference vector (...
Preprint
Indices quantifying the performance of classifiers under class imbalance often suffer from distortions depending on the constitution of the test set or the class-specific classification accuracy, creating difficulties in assessing the merit of the classifier. We identify two fundamental conditions that a performance index must satisfy to be respec...
Article
Despite being a well‐known problem, feature weighting and feature selection is a major predicament for clustering. Most of the algorithms, which provide weighting or selection of features, require the number of clusters to be known in advance. On the other hand, the existing automatic clustering procedures that can determine the number of clusters...
Article
Full-text available
In this paper, we propose a ranking method based on a matrix, called the D-matrix, with special identical diagonal values. This ranking system has five properties: (1) it can provide both biased and bias-free ranking; in addition, the working matrix can be built in two ways, results merging and results separating, for both biased and bias-free ma...
Presentation
Full-text available
Activation functions play an important role in neural networks by introducing non-linear properties. Thus, they are considered one of the essential ingredients among the building blocks of a neural network. But the selection of the appropriate activation function for the enhancement of model accuracy is strenuous in a se...