
Sujan Kumar Roy
- Ph.D. (Australia), M.A.Sc. (Canada), M.Sc. (Bangladesh)
- Assistant Teaching Professor at Michigan Technological University
Designing Deep Learning Algorithms for Signal Processing, Image Processing, and Computer Vision Applications.
About
39
Publications
26,225
Reads
184
Citations
Introduction
With over 14 years of experience in academia, I have taught and conducted research at prestigious institutions including Michigan Technological University (USA), University of Rajshahi (Bangladesh), Griffith University (Australia), and Concordia University (Canada). At Michigan Tech, I have extensive expertise in delivering both graduate and undergraduate courses, including Foundations of Data Science, Introduction to Data Science, Computational Intelligence, Data Mining, and Machine Learning.
Additional affiliations
January 2022 - August 2024
Position
- Professor (Associate)
Description
- I serve in various roles in the Department of Computer Science and Engineering, focusing on teaching and research in Core Computer Science and AI courses. My expertise includes programming, networks, cybersecurity, operating systems, and AI applications. My research targets Deep Learning and AI in biomedical signal/image processing and speech/audio processing.
March 2013 - present
Publications (39)
A basic introduction to AI, including machine learning, deep learning basics, domains of AI, and some real-world applications of AI.
The performance of speech coding, speech recognition, and speech enhancement systems that rely on the augmented Kalman filter (AKF) largely depends upon the accuracy of clean speech and noise linear prediction coefficient (LPC) estimation. The formulation of clean speech and noise LPC estimation as a supervised learning task has shown considerable p...
Inaccurate estimates of the speech and noise linear prediction coefficients (LPCs) introduce bias in the augmented Kalman filter (AKF) gain, which impacts the quality and intelligibility of enhanced speech. Although current tuning methods offset the bias in the AKF gain, particularly in colored noise conditions, they do not adequately address nonstatio...
Minimum mean-square error (MMSE)-based noise PSD estimators have been used widely for speech enhancement. However, MMSE noise PSD estimators assume that the noise signal changes at a slower rate than the speech signal, an assumption that limits their ability to track highly non-stationary noise sources. Moreover, the performance of the MMSE-based noise...
Inaccurate estimates of the linear prediction coefficient (LPC) and noise variance introduce bias in Kalman filter (KF) gain and degrade speech enhancement performance. The existing methods propose a tuning of the biased Kalman gain, particularly in stationary noise conditions. This paper introduces a tuning of the KF gain for speech enhancement in...
Speech corrupted by background noise (or noisy speech) can reduce the efficiency of human-to-human and human-to-machine communication. A speech enhancement algorithm (SEA) can be used to suppress the embedded background noise and increase the quality and intelligibility of noisy speech. Many applications, such as speech communication systems, hearing a...
The performance of speech coding, speech recognition, and speech enhancement largely depends upon the accuracy of the linear prediction coefficient (LPC) of clean speech and noise in practice. Formulation of speech and noise LPC estimation as a supervised learning problem has shown considerable promise. In its simplest form, a supervised technique,...
Current augmented Kalman filter (AKF)-based speech enhancement algorithms utilise a temporal convolutional network (TCN) to estimate the clean speech and noise linear prediction coefficient (LPC). However, the multi-head attention network (MHANet) has demonstrated the ability to more efficiently model the long-term dependencies of noisy speech than...
Current deep learning approaches to linear prediction coefficient (LPC) estimation for the augmented Kalman filter (AKF) produce biased estimates, due to the use of a whitening filter. This severely degrades the perceived quality and intelligibility of enhanced speech produced by the AKF. In this paper, we propose a deep learning framework that produ...
Inaccurate estimates of the linear prediction coefficient (LPC) and noise variance introduce bias in the Kalman filter (KF) gain and degrade speech enhancement performance. Existing methods propose a tuning of the biased Kalman gain, particularly in stationary noise conditions. This paper introduces a tuning of the KF gain for speech enhancement in...
Speech enhancement using the augmented Kalman filter (AKF) suffers from inaccurate estimates of its key parameters, the linear prediction coefficients (LPCs) of the speech and noise signals, in noisy conditions. The existing AKF enhances speech mainly in colored noise conditions. In this paper, a deep residual network (ResNet)-based method utilizes the...
Speech enhancement using the Kalman filter (KF) suffers from inaccurate estimates of the noise variance and the linear prediction coefficients (LPCs) in real-life noise conditions. This degrades speech enhancement performance. In this paper, a causal convolutional neural network (CCNN) model is used to more accurately estimate the noise varian...
The existing Kalman filter (KF) suffers from poor estimates of the noise variance and the linear prediction coefficients (LPCs) in real-world noise conditions. This results in a degraded speech enhancement performance. In this paper, a deep learning approach is used to more accurately estimate the noise variance and LPCs, enabling the KF to enhance...
The existing augmented Kalman filter (AKF) suffers from poor LPC estimates in real-world noise conditions, which degrades the speech enhancement performance. In this paper, a deep learning technique exploits the LPC estimates for the AKF to enhance speech in various noise conditions. Specifically, a deep residual network is used to estimate the noi...
Signal analysis in acoustic domain, transform domain, filter-bank.
This paper presents an iterative Kalman filter (IT-KF) with a reduced-bias Kalman gain for single-channel speech enhancement in non-stationary noise conditions (NNCs). The proposed IT-KF aims to offset the bias in the Kalman gain through efficient parameter estimation, thereby improving speech enhancement performance. To do this, we introduce a D...
This paper presents a non-iterative Kalman filter (NIT-KF) for single-channel speech enhancement in non-stationary noise conditions (NNCs). To adapt the NIT-KF to NNCs, we address the adjustment of the biased Kalman gain through efficient parameter estimation. We introduce an effective noise spectrum tracking method based on the decision-directed approach (D...
The quality and intelligibility of speech conversation are generally degraded by the surrounding noises. The main objective of speech enhancement (SE) is to eliminate or reduce such disturbing noises from the degraded speech. Various SE methods have been proposed in literature. Among them, the Kalman filter (KF) is known to be an efficient SE metho...
This paper presents an efficient pitch estimation algorithm for noisy speech signals using ensemble empirical mode decomposition (EEMD)-based time domain filtering. The dominant harmonic of the noisy speech is enhanced to make the pitch period more prominent. The normalized autocorrelation function (NACF) of the modified signal is then decomposed into time...
This paper presents an efficient pitch estimation algorithm (PEA) using dominant harmonic modification (DHM) and ensemble empirical mode decomposition (EEMD). The noisy speech is first low-pass filtered within the ranges of fundamental frequencies (50-500 Hz) to obtain the pre-filtered signal (PFS). The pre-processed signal is then modified by...
A novel and robust pitch estimation method is presented in this paper. The basic idea is to reshape the speech signal using a combination of the dominant harmonic modification (DHM) and data adaptive time domain filtering techniques. The noisy speech signal is filtered within the ranges of fundamental frequencies to obtain the pre-filtered signal (...
This paper presents an efficient pitch estimation algorithm for noisy speech signals using a combination of dominant harmonic modification (DHM) and a data adaptive time domain filtering approach. The noisy speech signal is pre-filtered within the range of the fundamental frequency. The dominant harmonic (DH) is determined in the pre-filtered signal and enh...
This paper presents a robust voiced/unvoiced classification method using a linear model of empirical mode decomposition (EMD) controlled by the Hurst exponent. EMD decomposes any signal into a finite number of band-limited signals called intrinsic mode functions (IMFs). It is assumed that a voiced speech signal is composed of a trend due to vocal cord vi...
This paper focuses on a pitch estimation method for noisy speech signals using a combination of empirical mode decomposition (EMD) and the discrete Fourier transform (DFT). The noisy speech signal is filtered within the range of the fundamental frequency. The normalized autocorrelation function (NACF) is computed from the pre-filtered noisy speech signal. The...
The design and implementation of an efficient Bangla-interfaced search engine using Natural Language Processing (NLP) are presented in this paper. The search engine provides a dynamic Bangla interface so that Bangla-speaking people can use it more frequently and easily. A user of the search engine can input a Bangla sentence without a...