Zhuohuang Zhang, Shenzhen Polytechnic University
Doctor of Philosophy
About
28 Publications
4,191 Reads
388 Citations
Introduction
Zhuohuang Zhang currently works at Shenzhen Polytechnic University as a lecturer. Zhuohuang does research in artificial intelligence, speech enhancement/separation and human speech perception.
Additional affiliations
August 2022 - June 2023
Education
August 2017 - May 2022
August 2015 - May 2017
August 2011 - June 2015
Publications (28)
Phase is a critical component of speech that influences its quality and intelligibility. Current speech enhancement algorithms are beginning to address phase distortions, but they focus on normal-hearing (NH) listeners. It is not clear whether phase enhancement is beneficial for hearing-impaired (HI) listeners. We investigated the...
Recent work has shown that it is feasible to use generative adversarial networks (GANs) for speech enhancement; however, these approaches have not been compared to state-of-the-art (SOTA) non-GAN-based approaches. Additionally, many loss functions have been proposed for GAN-based approaches, but they have not been adequately compared. In this stud...
Speech separation algorithms are often used to separate the target speech from other interfering sources. However, purely neural network based speech separation systems often cause nonlinear distortion that is harmful for automatic speech recognition (ASR) systems. The conventional mask-based minimum variance distortionless response (MVDR) beamform...
Recently we proposed an all-deep-learning minimum variance distortionless response (ADL-MVDR) method where the unstable matrix inverse and principal component analysis (PCA) operations in the MVDR were replaced by recurrent neural networks (RNNs). However, it is not clear whether the success of the ADL-MVDR is owed to the calculated covariance matr...
Many purely neural network based speech separation approaches have been proposed to improve objective assessment scores, but they often introduce nonlinear distortions that are harmful to modern automatic speech recognition (ASR) systems. Minimum variance distortionless response (MVDR) filters are often adopted to remove nonlinear distortions, howe...
With the advances in speech communication systems such as online conferencing applications, we can seamlessly work with people regardless of where they are. However, during online meetings, speech quality can be significantly affected by background noise, reverberation, packet loss, network jitter, etc. Because of its nature, speech quality is trad...
Personal hearing devices, such as hearing aids, may be fine-tuned by allowing the users to conduct self-adjustment. Two self-adjustment procedures were developed to collect the listener preferred gains in six octave-frequency bands from 0.25 kHz to 8 kHz. These procedures were designed to allow rapid exploration of a multi-dimensional parameter spa...
Continuous speech separation (CSS) aims to separate overlapping voices from a continuous influx of conversational audio containing an unknown number of utterances spoken by an unknown number of speakers. A common application scenario is transcribing a meeting conversation recorded by a microphone array. Prior studies explored various deep learning...
Phase is important for speech since it contributes to the quality and intelligibility during speech perception. Many speech enhancement algorithms lack the ability to predict phase for speech reconstruction and apply the noisy phase instead. In this study, we investigated the influence of phase distortion on the speech-qua...
Hearing loss is prevalent among elderly adults, which leads to speech-understanding difficulties in noisy environments. Speech enhancement algorithms are thus proposed to alleviate this speech-in-noise problem. However, most of these algorithms have not been evaluated for hearing-impaired people either with or without the use of hearing aids. In th...
Many speech enhancement algorithms have been proposed over the years, and it has been shown that deep neural networks can lead to significant improvements. These algorithms, however, have not been validated for hearing-impaired listeners. Additionally, these algorithms are often evaluated under a limited range of signal-to-noise ratios (SNR). Here,...
Personal hearing devices, such as hearing aids, may be fine-tuned for individual users’ preferences by allowing them to self-adjust the amplification profiles. The purpose of the current study was to compare two self-adjustment methods in terms of their test-retest reliability. Both methods estimated preferred amplification profiles in six octave-f...
Objective speech-quality metrics have been used widely as a tool to evaluate the performance of speech enhancement algorithms. Two widely adopted metrics are Perceptual Evaluation of Speech Quality (PESQ) and Hearing-Aid Speech Quality Index (HASQI). While PESQ is based on a highly-simplified phenomenological model of auditory perception for normal...
A Bayesian adaptive procedure, the interleaved-equal-loudness contour (IELC) procedure, was developed to improve the efficiency in estimating the equal-loudness contour. Experiment 1 evaluated the test-retest reliability of the IELC procedure using six naive normal-hearing listeners. Two IELC runs of 200 trials were conducted and excellent test-ret...
A retina-like imaging system is an imaging system with space-variant resolution similar to the photoreceptor distribution of the primate retina. In this paper, the design and implementation of a retina-like imaging system based on a non-uniform lens array are introduced. Firstly, the mathematical model of the non-uniform lens array is deduced. Secon...