Saket Anand

Saket Anand
Indraprastha Institute of Information Technology | IIITD · Department of Computer Science

Doctor of Philosophy

About

66
Publications
10,433
Reads
How we measure 'reads'
A 'read' is counted each time someone views a publication summary (such as the title, abstract, and list of authors), clicks on a figure, or views or downloads the full-text. Learn more
683
Citations
Citations since 2017
56 Research Items
595 Citations
2017201820192020202120222023050100150
2017201820192020202120222023050100150
2017201820192020202120222023050100150
2017201820192020202120222023050100150
Introduction
Saket Anand currently works at the Department of Computer Science, Indraprastha Institute of Information Technology. Saket does research in Computer Vision, Machine Learning and Artificial Intelligence. His interests lie at the intersection of Semi-Supervised Learning, Geometry and Robust Statistics.

Publications

Publications (66)
Chapter
Full-text available
Label hierarchies are often available apriori as part of biological taxonomy or language datasets WordNet. Several works exploit these to learn hierarchy aware features in order to improve the classifier to make semantically meaningful mistakes while maintaining or reducing the overall error. In this paper, we propose a novel approach for learning...
Conference Paper
Full-text available
In Active Domain Adaptation (ADA), one uses Active Learning (AL) to select a subset of images from the target domain, which are then annotated and used for supervised domain adaptation (DA). Given the large performance gap between supervised and unsupervised DA techniques, ADA allows for an excellent trade-off between annotation cost and performanc...
Preprint
Full-text available
In Active Domain Adaptation (ADA), one uses Active Learning (AL) to select a subset of images from the target domain, which are then annotated and used for supervised domain adaptation (DA). Given the large performance gap between supervised and unsupervised DA techniques, ADA allows for an excellent trade-off between annotation cost and performanc...
Preprint
Full-text available
The availability of well-curated datasets has driven the success of Machine Learning (ML) models. Despite the increased access to earth observation data for agriculture, there is a scarcity of curated, labelled datasets, which limits the potential of its use in training ML models for remote sensing (RS) in agriculture. To this end, we introduce a f...
Preprint
Full-text available
Label hierarchies are often available apriori as part of biological taxonomy or language datasets WordNet. Several works exploit these to learn hierarchy aware features in order to improve the classifier to make semantically meaningful mistakes while maintaining or reducing the overall error. In this paper, we propose a novel approach for learning...
Preprint
Full-text available
The integration of the modern Machine Learning (ML) models into remote sensing and agriculture has expanded the scope of the application of satellite images in the agriculture domain. In this paper, we present how the accuracy of crop type identification improves as we move from medium-spatiotemporal-resolution (MSTR) to high-spatiotemporal-resolut...
Preprint
Cyber Physical Systems (CPS) applications have agents that actuate in their local vicinity, while requiring measurements that capture the state of their larger environment to make actuation choices. These measurements are made by sensors and communicated over a network as update packets. Network resource constraints dictate that updates arrive at a...
Preprint
Full-text available
Our objective in this study was to determine if machine learning (ML) can automatically recognize neonatal manipulations, along with associated changes in physiological parameters. A retrospective observational study was carried out in two Neonatal Intensive Care Units (NICUs) between December 2019 to April 2020. Both the video and physiological da...
Preprint
Full-text available
Semi-supervised learning approaches have emerged as an active area of research to combat the challenge of obtaining large amounts of annotated data. Towards the goal of improving the performance of semi-supervised learning methods, we propose a novel framework, HIERMATCH, a semi-supervised approach that leverages hierarchical information to reduce...
Preprint
Full-text available
Contextual information is a valuable cue for Deep Neural Networks (DNNs) to learn better representations and improve accuracy. However, co-occurrence bias in the training dataset may hamper a DNN model's generalizability to unseen scenarios in the real world. For example, in COCO, many object categories have a much higher co-occurrence with men com...
Article
Code clones are duplicate code fragments that share (nearly) similar syntax or semantics. Code clone detection plays an important role in software maintenance, code refactoring, and reuse. A substantial amount of research has been conducted in the past to detect clones. A majority of these approaches use lexical and syntactic information to detect...
Article
Surveillance camera networks are a useful monitoring infrastructure that can be used for various visual analytics applications, where high-level inferences and predictions could be made based on target tracking across the network. Most multi-camera tracking works focus on re-identification problems and trajectory association problems. However, as c...
Article
Full-text available
Objectives The objectives of this study are to construct the high definition phenotype (HDP), a novel time-series data structure composed of both primary and derived parameters, using heterogeneous clinical sources and to determine whether different predictive models can utilize the HDP in the neonatal intensive care unit (NICU) to improve neonatal...
Chapter
Full-text available
Classical monocular Simultaneous Localization And Mapping (SLAM) and the recently emerging convolutional neural networks (CNNs) for monocular depth prediction represent two largely disjoint approaches towards building a 3D map of the surrounding environment. In this paper, we demonstrate that the coupling of these two by leveraging the strengths of...
Preprint
Full-text available
Code clones are duplicate code fragments that share (nearly) similar syntax or semantics. Code clone detection plays an important role in software maintenance, code refactoring, and reuse. A substantial amount of research has been conducted in the past to detect clones. A majority of these approaches use lexical and syntactic information to detect...
Article
Surveillance camera networks are a useful infrastructure for various visual analytics applications, where high-level inferences and predictions could be made based on target tracking across the network. Most multi-camera tracking works focus on target re-identification and trajectory association problems to track the target. However, since camera n...
Chapter
Requirement of large annotated datasets restrict the use of deep convolutional neural networks (CNNs) for many practical applications. The problem can be mitigated by using active learning (AL) techniques which, under a given annotation budget, allow to select a subset of data that yields maximum accuracy upon fine tuning. State of the art AL appro...
Preprint
Full-text available
Requirement of large annotated datasets restrict the use of deep convolutional neural networks (CNNs) for many practical applications. The problem can be mitigated by using active learning (AL) techniques which, under a given annotation budget, allow to select a subset of data that yields maximum accuracy upon fine tuning. State of the art AL appro...
Preprint
Full-text available
Deep Neural Networks (DNNs) are often criticized for being susceptible to adversarial attacks. Most successful defense strategies adopt adversarial training or random input transformations that typically require retraining or fine-tuning the model to achieve reasonable performance. In this work, our investigations of intermediate representations of...
Preprint
Full-text available
Robust multiple model fitting plays a crucial role in many computer vision applications. Unlike single model fitting problems, the multi-model fitting has additional challenges. The unknown number of models and the inlier noise scale are the two most important of them, which are in general provided by the user using ground-truth or some other auxil...
Preprint
Visual animal biometrics is rapidly gaining popularity as it enables a non-invasive and cost-effective approach for wildlife monitoring applications. Widespread usage of camera traps has led to large volumes of collected images, making manual processing of visual content hard to manage. In this work, we develop a framework for automatic detection a...
Preprint
Input transformation based defense strategies fall short in defending against strong adversarial attacks. Some successful defenses adopt approaches that either increase the randomness within the applied transformations, or make the defense computationally intensive, making it substantially more challenging for the attacker. However, it limits the a...
Preprint
Full-text available
Classical monocular Simultaneous Localization And Mapping (SLAM) and the recently emerging convolutional neural networks (CNNs) for monocular depth prediction represent two largely disjoint approaches towards building a 3D map of the surrounding environment. In this paper, we demonstrate that the coupling of these two by leveraging the strengths of...
Preprint
Full-text available
Surveillance camera networks are a useful infrastructure for various visual analytics applications, where high-level inferences and predictions could be made based on target tracking across the network. Most multi-camera tracking works focus on target re-identification and trajectory association problems to track the target. However, since camera n...
Article
Convolutional Neural Networks (CNNs) are popular models that have been successfully applied to diverse domains like vision, speech, and text. To reduce inference-time latency, it is common to employ hardware accelerators, which often require a model compression step. Contrary to most compression algorithms that are agnostic of the underlying hardwa...
Chapter
Disentangling underlying feature attributes within an image with no prior supervision is a challenging task. Models that can disentangle attributes well, provide greater interpretability and control. In this paper, we propose a self-supervised framework: DisCont to disentangle multiple attributes by exploiting structural inductive biases within ima...
Conference Paper
Visual data analytics is increasingly becoming an important part of wildlife monitoring and conservation strategies. In this work, we discuss our solution to the image-based Amur tiger re-identification (Re-ID) challenge hosted by the CVWC Workshop at ICCV 2019. Various factors like poor quality images, lighting and pose variations, and limited ima...
Chapter
Full-text available
Ecological imbalance owing to rapid urbanization and deforestation has adversely affected the population of several wild animals. This loss of habitat has skewed the population of several non-human primate species like chimpanzees and macaques and has constrained them to co-exist in close proximity of human settlements, often leading to human-wildl...
Preprint
Full-text available
Learning representations that can disentangle explanatory attributes underlying the data improves interpretabilty as well as provides control on data generation. Various learning frameworks such as VAEs, GANs and auto-encoders have been used in the literature to learn such representations. Most often, the latent space is constrained to a partitione...
Preprint
Full-text available
Ecological imbalance owing to rapid urbanization and deforestation has adversely affected the population of several wild animals. This loss of habitat has skewed the population of several non-human primate species like chimpanzees and macaques and has constrained them to co-exist in close proximity of human settlements, often leading to human-wildl...
Conference Paper
Surveillance camera networks are a useful monitoring infrastructure that can be used for various visual analytics applications , where high-level inferences and predictions could be made based on target tracking across the network. Most multi-camera tracking works focus on re-identification problems and trajectory association problems. However, as...
Preprint
Full-text available
Deep generative models like variational autoencoders approximate the intrinsic geometry of high dimensional data manifolds by learning low-dimensional latent-space variables and an embedding function. The geometric properties of these latent spaces has been studied under the lens of Riemannian geometry; via analysis of the non-linearity of the gene...
Chapter
Deforestation and loss of habitat have resulted in rapid decline of certain species of primates in forests. On the other hand, uncontrolled growth of a few species of primates in urban areas has led to safety issues and nuisance for the local residents. Hence, identifying individual primates has become the need of the hour - not only for conservati...
Preprint
Full-text available
Despite loss of natural habitat due to development and urbanization, certain species like the Rhesus macaque have adapted well to the urban environment. With abundant food and no predators, macaque populations have increased substantially in urban areas, leading to frequent conflicts with humans. Overpopulated areas often witness macaques raiding c...
Chapter
Generative models that learn disentangled representations for different factors of variation in an image can be very useful for targeted data augmentation. By sampling from the disentangled latent subspace of interest, we can efficiently generate new data necessary for a particular task. Learning disentangled representations is a challenging proble...
Preprint
Full-text available
Target tracking in a camera network is an important task for surveillance and scene understanding. The task is challenging due to disjoint views and illumination variation in different cameras. In this direction, many graph-based methods were proposed using appearance-based features. However, the appearance information fades with high illumination...
Preprint
Full-text available
Our premise is that autonomous vehicles must optimize communications and motion planning jointly. Specifically, a vehicle must adapt its motion plan staying cognizant of communications rate related constraints and adapt the use of communications while being cognizant of motion planning related restrictions that may be imposed by the on-road environ...
Preprint
Clustering using neural networks has recently demonstrated promising performance in machine learning and computer vision applications. However, the performance of current approaches is limited either by unsupervised learning or their dependence on large set of labeled data samples. In this paper, we propose ClusterNet that uses pair-wise semantic c...
Preprint
Full-text available
Clustering using neural networks has recently demon- strated promising performance in machine learning and computer vision applications. However, the performance of current approaches is limited either by unsupervised learn- ing or their dependence on large set of labeled data sam- ples. In this paper, we propose ClusterNet that uses pair- wise sem...
Preprint
Full-text available
Recent advances in neural network based acoustic modelling have shown significant improvements in automatic speech recognition (ASR) performance. In order for acoustic models to be able to handle large acoustic variability, large amounts of labeled data is necessary, which are often expensive to obtain. This paper explores the application of advers...
Preprint
Full-text available
Generative models that learn disentangled representations for different factors of variation in an image can be very useful for targeted data augmentation. By sampling from the disentangled latent subspace of interest, we can efficiently generate new data necessary for a particular task. Learning disentangled representations is a challenging proble...
Chapter
Visual animal biometrics is rapidly gaining popularity as it enables a non-invasive and cost-effective approach for wildlife monitoring applications. Widespread usage of camera traps has led to large volumes of collected images, making manual processing of visual content hard to manage. In this work, we develop a framework for automatic detection a...
Conference Paper
Full-text available
Semi-Global Matching (SGM) is a popular algorithm to calculate depth maps in stereo images offering the best trade-off among accuracy, computational costs and high frame rates. This paper presents two architectural improvements in FPGA implementations of SGM to achieve high frame rates. First, a highly parallel, pipelined and scalable architecture...
Conference Paper
Robust multi-model fitting problems are often solved using consensus based or preference based methods, each of which captures largely independent information from the data. However, most existing techniques still adhere to either of these approaches. In this paper, we bring these two paradigms together and present a novel robust method for discove...
Conference Paper
In this work we propose an l p -norm data fidelity constraint for training the autoencoder. Usually the Euclidean distance is used for this purpose; we generalize the l 2 -norm to the l p -norm; smaller values of p make the problem robust to outliers. The ensuing optimization problem is solved using the Augmented Lagrangian approach. The proposed l...
Chapter
Real-world visual data are often corrupted and require the use of estimation techniques that are robust to noise and outliers. Robust methods are well studied for Euclidean spaces and their use has also been extended to Riemannian spaces. In this chapter, we present the necessary mathematical constructs for Grassmann manifolds, followed by two diff...
Article
Full-text available
Mean shift clustering is a powerful nonparametric technique that does not require prior knowledge of the number of clusters and does not constrain the shape of the clusters. However, being completely unsupervised, its performance suffers when the original distance metric fails to capture the underlying cluster structure. Despite recent advances in...
Article
Full-text available
We propose a novel robust estimation algorithm - the generalized projection based M-estimator (gpbM), which does not require the user to specify any scale parameters. The algorithm is general and can handle heteroscedastic data with multiple linear constraints for single and multi-carrier problems. The gpbM has three distinct stages -- scale estima...
Conference Paper
Full-text available
We introduce a robust estimator called generalized projection based M-estimator (gpbM) which does not require the user to specify any scale parameters. For multiple inlier structures, with different noise covariances, the estimator iteratively determines one inlier structure at a time. Unlike pbM, where the scale of the inlier noise is estimated si...
Conference Paper
Full-text available
Following work of Stroud and Saeger, we investigate the formulation of the port of entry inspection algorithm problem as a problem of finding an op- timal binary decision tree for an appropriate Boolean decision function. We re- port on an experimental analysis of the robustness of the conclusions of the Stroud-Saeger analysis and show that the opt...
Article
A new face recognition algorithm is proposed which is robust to variations in pose, expression and illumination. The framework is similar to the ubiquitous block matching algorithm used for motion estimation in video compression but has been adapted to compensate for illumination differences. One of the key differentiators of this approach is that...

Network

Cited By