Harishchandra Dubey

Microsoft · Microsoft Research

PhD
Machine Learning for Audio

About

121
Publications
48,813
Reads
2,921
Citations
Citations since 2017
81 Research Items
2806 Citations
Introduction
I am interested in audio, speech, and language processing; machine learning; fog and cloud computing; and social signal processing. I received a PhD in Electrical Engineering from the Center for Robust Speech Systems at the University of Texas at Dallas, USA, a Master of Science from FAU (University of Erlangen-Nuremberg), Germany, in 2015, and a Bachelor of Technology in Electronics and Communication Engineering from Motilal Nehru National Institute of Technology, Allahabad, India, in 2012.
Additional affiliations
May 2015 - December 2015
University of Rhode Island
Position
  • Researcher
October 2013 - April 2015
Friedrich-Alexander-University of Erlangen-Nürnberg
Position
  • Master of Science
Description
  • Prof. Dr.-Ing. Walter Kellermann
July 2008 - July 2012
Motilal Nehru National Institute of Technology
Position
  • Bachelor of Technology

Publications

Publications (121)
Preprint
Full-text available
Deep Speech Enhancement Challenge is the 5th edition of deep noise suppression (DNS) challenges organized at ICASSP 2023 Signal Processing Grand Challenges. DNS challenges were organized during 2019-2023 to stimulate research in deep speech enhancement (DSE). Previous DNS challenges were organized at INTERSPEECH 2020, ICASSP 2021, INTERSPEECH 2021,...
Preprint
Full-text available
The Deep Noise Suppression (DNS) challenge is designed to foster innovation in the area of noise suppression to achieve superior perceptual speech quality. This is the 4th DNS challenge, with the previous ones held at INTERSPEECH 2020, ICASSP 2021, and INTERSPEECH 2021. We open-source training and test datasets for researchers to train their deep...
Article
Full-text available
The number of Internet users is increasing day by day because the Internet supports many applications and enables innovative services. Along with this, energy consumption is also becoming an important concern in networking. Several researchers have investigated energy saving schemes for networks. Software Defined Networking (SDN) is an excellent choice which impro...
Preprint
Full-text available
With the recent growth of remote and hybrid work, online meetings often encounter challenging audio contexts such as background noise, music, and echo. Accurate real-time detection of music events can help to improve the user experience in such scenarios, e.g., by switching to high-fidelity music-specific codec or selecting the optimal noise suppre...
Preprint
Full-text available
The Deep Noise Suppression (DNS) challenge is designed to foster innovation in the area of noise suppression to achieve superior perceptual speech quality. We recently organized a DNS challenge special session at INTERSPEECH and ICASSP 2020. We open-sourced training and test datasets for the wideband scenario. We also open-sourced a subjective eval...
Preprint
Full-text available
The Deep Noise Suppression (DNS) challenge is designed to foster innovation in the area of noise suppression to achieve superior perceptual speech quality. We recently organized a DNS challenge special session at INTERSPEECH 2020. We open sourced training and test datasets for researchers to train their noise suppression models. We also open source...
Chapter
Building an effective credit card fraud detection model is one of the most challenging issues for financial organizations. Statistical and machine learning (ML) techniques are widely explored in financial applications, but there is no rule of thumb for which technique gives better performance. Recent studies conclude that ensemble learning may be the right approach in...
Article
Full-text available
Cloud computing is one of the most tempting technologies in today's computing scenario as it provides cost‐efficient solutions by reducing the large upfront cost for buying hardware infrastructures and computing power. Fog computing is an added support to cloud environment by leveraging with doing some of the less compute intensive task to be don...
Preprint
The INTERSPEECH 2020 Deep Noise Suppression (DNS) Challenge is intended to promote collaborative research in real-time single-channel Speech Enhancement aimed to maximize the subjective (perceptual) quality of the enhanced speech. A typical approach to evaluate the noise suppression methods is to use objective metrics on the test set obtained by sp...
Article
Full-text available
Several real-world applications involve the aggregation of physical features corresponding to different geographic and topographic phenomena. This information plays a crucial role in analyzing and predicting several events. The application areas, which often require a real-time analysis, include traffic flow, forest cover, disease monitoring and so...
Article
Full-text available
This article describes how machine learning (ML) algorithms are very useful for analysis of data and finding some meaningful information out of them, which could be used in various other applications. In the last few years, an explosive growth has been seen in the dimension and structure of data. There are several difficulties faced by conventional...
Preprint
Full-text available
This paper investigates several aspects of training a RNN (recurrent neural network) that impact the objective and subjective quality of enhanced speech for real-time single-channel speech enhancement. Specifically, we focus on a RNN that enhances short-time speech spectra on a single-frame-in, single-frame-out basis, a framework adopted by most cl...
Preprint
Full-text available
The INTERSPEECH 2020 Deep Noise Suppression Challenge is intended to promote collaborative research in real-time single-channel Speech Enhancement aimed to maximize the subjective (perceptual) quality of the enhanced speech. A typical approach to evaluate the noise suppression methods is to use objective metrics on the test set obtained by splittin...
Preprint
Full-text available
Audio event classification is an important task for several applications such as surveillance, audio, video and multimedia retrieval etc. There are approximately 3M people with hearing loss who can't perceive events happening around them. This paper establishes the CURE dataset which contains curated set of specific audio events most relevant for p...
Chapter
Full-text available
This article describes how machine learning (ML) algorithms are very useful for analysis of data and finding some meaningful information out of them, which could be used in various other applications. In the last few years, an explosive growth has been seen in the dimension and structure of data. There are several difficulties faced by conventional...
Article
Full-text available
Geospatial data analysis using cloud computing platform is one of the promising areas for analysing, retrieving, and processing volumetric data. Fog computing paradigm assists cloud platform where fog devices try to increase the throughput and reduce latency at the edge of the client. In this research paper, the authors discuss two case studies on...
Chapter
Full-text available
This article describes how machine learning (ML) algorithms are very useful for analysis of data and finding some meaningful information out of them, which could be used in various other applications. In the last few years, an explosive growth has been seen in the dimension and structure of data. There are several difficulties faced by conventional...
Article
Full-text available
Audio event classification is an important task for several applications such as surveillance, audio, video and multimedia retrieval etc. There are approximately 3M people with hearing loss who can't perceive events happening around them. This paper establishes the CURE dataset which contains curated set of specific audio events most relevant for...
Preprint
Full-text available
Linear and non-linear measures of heart rate variability (HRV) are widely investigated as non-invasive indicators of health. Stress has a profound impact on heart rate, and different meditation techniques have been found to modulate heartbeat rhythm. This paper aims to explore the process of identifying appropriate metrics from HRV analysis for so...
Conference Paper
Full-text available
Linear and non-linear measures of heart rate variability (HRV) are widely investigated as non-invasive indicators of health. Stress has a profound impact on heart rate, and different meditation techniques have been found to modulate heartbeat rhythm. This paper aims to explore the process of identifying appropriate metrics from HRV analysis for so...
Preprint
Full-text available
Speaker diarization determines "who spoke and when?" in an audio stream. In this study, we propose a model-based approach for robust speaker clustering using i-vectors. The i-vectors extracted from different segments of the same speaker are correlated. We model this correlation with a Markov Random Field (MRF) network. Leveraging the advancements in MRF m...
Article
Full-text available
In today’s digital world healthcare is one core area of the medical domain. A healthcare system is required to analyze a large amount of patient data which helps to derive insights and assist the prediction of diseases. This system should be intelligent in order to predict a health condition by analyzing a patient’s lifestyle, physical health recor...
Book
This book introduces the latest research findings in cloud, edge, fog, and mist computing and their applications in various fields using geospatial data. It solves a number of problems of cloud computing and big data, such as scheduling, security issues using different techniques, which researchers from industry and academia have been attempting to...
Article
Full-text available
Spatial Data Infrastructure (SDI) is an important framework for sharing geospatial big data using the web. Integration of SDI with cloud computing led to emergence of Cloud-SDI as a tool for transmission, processing and analysis of geospatial data. Fog computing is a paradigm where embedded computers are employed to increase the throughput and redu...
Chapter
In the digital planet, the concept of spatial data, its cloud and Geographical Indications (GI) plays a crucial role for mapping any organization or point and acquired a reputation for producing quality results based on their spatial characteristics, including their visualization. From the twentieth century onwards, the GIS were also developed to c...
Article
Full-text available
The cloud and fog computing paradigms are a developing area for storing, processing, and analyzing geospatial big data. The latest trend is mist computing, which boosts fog and cloud concepts by using edge devices to help increase throughput and reduce latency at the client edge. The present research article discussed t...
Chapter
Big data analytics with cloud computing is one of the emerging areas for processing and analytics. Fog computing is the paradigm where fog devices help to reduce latency and increase throughput at the edge of the client. This article discusses the emergence of fog computing for mining analytics in big data from geospatial and medi...
Chapter
Full-text available
This chapter proposes and develops a cloud-computing-based SDI model named as TCloud for sharing, analysis, and processing of spatial data particularly in the Temple City of India, Bhubaneswar. The main purpose of TCloud is to integrate all the spatial information such as tourism sites which include various temples, mosques, churches, monuments, la...
Article
Full-text available
The use of wearables and the Internet of Things (IoT) for smart and affordable healthcare is trending. In traditional setups, the cloud backend receives the healthcare data and performs monitoring and prediction for diseases, diagnosis, and wellness prediction. Fog computing (FC) is a distributed computing paradigm that leverages low-power embedded proc...
Article
Full-text available
Multi-layer noise refers to scenarios where multiple distinct noise sources are simultaneously active in an audio stream. We collected a corpus named the CRSS long-duration naturalistic noise (CRSS-LDNN) corpus. It contains noise captured from complex daily-life activities using wearable LENA units. The diversity in noise-sources include constructi...
Preprint
Full-text available
Speaker Diarization (i.e. determining who spoke and when?) for multi-speaker naturalistic interactions such as Peer-Led Team Learning (PLTL) sessions is a challenging task. In this study, we propose robust speaker clustering based on mixture of multivariate von Mises-Fisher distributions. Our diarization pipeline has two stages: (i) ground-truth se...
Preprint
Full-text available
Wrist-bands such as smartwatches have become an unobtrusive interface for collecting physiological and contextual data from users. Smartwatches are being used for smart healthcare, telecare, and wellness monitoring. In this paper, we used data collected from the AnEAR framework leveraging smartwatches to gather and store physiological data from pat...
Preprint
Full-text available
In certain applications such as zero-resource speech processing or very-low resource speech-language systems, it might not be feasible to collect speech activity detection (SAD) annotations. However, the state-of-the-art supervised SAD techniques based on neural networks or other machine learning methods require annotated training data matched to t...
Article
Full-text available
This article describes how machine learning (ML) algorithms are very useful for analysis of data and finding some meaningful information out of them, which could be used in various other applications. In the last few years, an explosive growth has been seen in the dimension and structure of data. There are several difficulties faced by conventio...
Chapter
Full-text available
This chapter proposes and develops a cloud-computing-based SDI model named as TCloud for sharing, analysis, and processing of spatial data particularly in the Temple City of India, Bhubaneswar. The main purpose of TCloud is to integrate all the spatial information such as tourism sites which include various temples, mosques, churches, monuments, la...
Article
Full-text available
Robust speech processing for single-stream audio data has achieved significant progress in the last decade. However, multi-stream speech processing poses new challenges not present in single-stream data. Peer-led team learning (PLTL) is a teaching paradigm popular among US universities for undergraduate education in STEM courses. In collaborati...
Chapter
This book chapter discusses the concept of edge-assisted cloud computing and its relation to the emerging domain of “Fog-of-things (FoT)”. Such systems employ low-power embedded computers to provide local computation close to clients or cloud. The discussed architectures cover applications in medical, healthcare, wellness and fitness monitoring, ge...
Article
Full-text available
Wearable photoplethysmography (PPG) has recently become a common technology in heart rate (HR) monitoring. The general observation is that motion artifacts change the statistics of the acquired PPG signal. Consequently, estimation of HR from such a corrupted PPG signal is challenging. However, if an accelerometer is also used to acquire the acceleration...
Article
Full-text available
The smart health paradigms employ Internet-connected wearables for tele-monitoring and diagnosis, providing inexpensive healthcare solutions. Mist computing reduces latency and increases throughput by processing data near the edge of the network. In the present paper, we proposed a secure mist computing architecture that is validated on recently releas...
Article
Full-text available
Big data analytics with cloud computing is one of the emerging areas for processing and analytics. Fog computing is the paradigm where fog devices help to reduce latency and increase throughput at the edge of the client. This article discusses the emergence of fog computing for mining analytics in big data from geospatial and medi...
Article
Full-text available
The present manuscript concentrates on the application of Fog computing to a Smart Grid Network that comprises a Distribution Generation System known as a Microgrid. It addresses features and advantages of a smart grid. Two computational methods for on-demand processing based on shared information resources are discussed. Fog Computing acts as an...
Chapter
Full-text available
Geospatial data analysis with the help of cloud and fog computing is one of the emerging areas for processing, storing and analysis of geospatial data. Mist computing is also one of the paradigms where fog devices help to reduce latency and increase throughput near the edge device of the client. It discusses the emerge...
Article
The smart health paradigms employ Internet-connected wearables for tele-monitoring and diagnosis, providing inexpensive healthcare solutions. Mist computing reduces latency and increases throughput by processing data near the edge of the network. In the present paper, we proposed a secure mist computing architecture that is validated on recently releas...
Chapter
Big data analytics with the help of cloud computing is one of the emerging areas for processing and analytics in health care systems. Mist computing is one of the paradigms where edge devices assist the fog node to reduce latency and increase throughput at the edge of the client. This paper discusses the emergence of mist computing...
Chapter
Full-text available
The enormous growth in communication technology is resulting in an excessively connected network where billions of connected devices produce massive data flow. The fifth generation mobile technology will need a major paradigm shift so as to fulfill the increasing demand for reliable, ubiquitous connectivity, higher bandwidth, lower latencies, and...
Article
Full-text available
The increasing use of wearables in smart telehealth systems has led to the generation of huge volumes of medical big data. Cloud and fog services leverage these data for assisting clinical procedures. IoT healthcare has benefited from this large pool of generated data. This paper suggests the use of low-resource machine learning on Fog devices kept close to wearab...
Article
Full-text available
The smart health paradigms employ Internet-connected wearables for telemonitoring and diagnosis, providing inexpensive healthcare solutions. Fog computing reduces latency and increases throughput by processing data near the body sensor network. In this paper, we proposed a secure service-orientated edge computing architecture that is validated on re...
Article
Full-text available
In this paper, we present the design of a wearable photoplethysmography (PPG) system, R-band, for acquiring the PPG signals. PPG signals are influenced by the respiration or breathing process and hence can be used for estimation of respiration rate. R-Band detects the PPG signal that is routed to a Bluetooth low energy device such as a near...
Article
This study considers the situation where computational loads are transferred to edge devices and a single edge device is not enough. The co-operative sensing, analysis and transmission between several edge nodes helps in enhancing scalability in Fog computing frameworks. The present study uses the positive case of malaria vector borne dis...