Fig 4 - uploaded by B. Ziolko
Source publication
This paper suggests a speech enhancement approach for an eavesdropping audio system. The speech signal is disturbed by non-stochastic noise. The algorithm is based on recordings from a dual-microphone system; the Wiener filter was applied for speech extraction. The algorithm is also designed to capture dialogues in noisy environments. It uses the small...
Context in source publication
Context 1
... to the long computation time. On the other hand, a low filter order may not be sufficient to compensate for the difference in distance between the microphones. According to (3), each increment of the filter order enables compensation of an additional distance of ρ ≈ 7.5 mm. Taking the above into consideration, N = 100 was assumed in the conducted experiments. Fig. 4 presents the Wiener filter adaptation. Fig. 5 shows the values of the Wiener filter coefficients after 1 second of the adaptation process. The position of the maximum value corresponds to the physical difference in distances. The evaluation criterion for the algorithm was the increase in the Voice-to-Music Ratio (VMR). It was assumed that the voice signal is ...
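The adaptation described above can be illustrated with a minimal sketch. The snippet below is not the paper's implementation: it uses a normalized LMS update (a common way to adapt a Wiener-type FIR filter online) and synthetic two-microphone data in which the second channel is simply a delayed copy of the first, so that the position of the peak coefficient corresponds to the inter-microphone delay, as in Fig. 5.

```python
import numpy as np

def lms_adapt(x, d, order=100, mu=0.5):
    """Adapt an FIR filter w so that (w * x) approximates d (NLMS update).

    x, d : signals from the two microphones; order : number of taps.
    Returns the adapted coefficient vector.
    """
    w = np.zeros(order)
    for n in range(order - 1, len(x)):
        u = x[n - order + 1:n + 1][::-1]   # u[k] = x[n - k], most recent first
        e = d[n] - w @ u                   # estimation error
        w += mu * e * u / (u @ u + 1e-8)   # normalized LMS step
    return w

# Hypothetical demo: mic 2 receives mic 1's signal delayed by 12 samples.
rng = np.random.default_rng(0)
mic1 = rng.standard_normal(5000)
delay = 12
mic2 = np.concatenate([np.zeros(delay), mic1[:-delay]])
w = lms_adapt(mic1, mic2, order=100)
print(int(np.argmax(np.abs(w))))  # peak tap index matches the delay: 12
```

With the paper's ρ ≈ 7.5 mm per tap, a peak at tap 12 would correspond to a path-length difference of roughly 9 cm; the sampling rate and geometry here are assumptions for illustration only.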
Citations
... The distance between neighbouring microphones is equal to d, and the distance between speaker A and microphone 1 is equal to r_A,1 [1], [2], [3]. Some of them consider conference and diarization systems. ...
Some existing conference systems employ a distant microphone array instead of microphones dedicated to each user. This approach is much more convenient, although it suffers from much higher noise sensitivity. One possible solution is to employ beamforming techniques to focus on the user who is speaking at the moment. However, the beamformer needs the direction-of-arrival (DOA) parameter, which is usually obtained by analysing the phase differences between signals. The effectiveness of such a solution decreases dramatically when the environment becomes noisy. In this paper, a novel, robust meeting diarization system is described. The decision about which user is speaking at the moment is based not only on spatial features of the signal (i.e., the speaker's localization) but also on spectral features. The microphone array estimates the speaker's localization using generalized cross-correlation with phase transform (GCC-PHAT). Additionally, a speaker recognition system employing the wavelet-Fourier transform (WFT) extracts spectral features of the voice. The described solution is much more robust than one based on speaker recognition or speaker localization alone. Experiments during meetings in a regular meeting room show that it is less noise-sensitive and that switching between speakers is several times faster.
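GCC-PHAT itself is compact enough to sketch. The following is a minimal, self-contained illustration (not the cited system's code): the cross-power spectrum of the two channels is whitened so that only phase information remains, and the peak of its inverse transform gives the relative delay in samples, from which the DOA can be derived given the array geometry.

```python
import numpy as np

def gcc_phat(sig, ref):
    """Estimate the delay of `sig` relative to `ref` (in samples)
    via generalized cross-correlation with phase transform."""
    n = len(sig) + len(ref)
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)
    R = SIG * np.conj(REF)
    R /= np.abs(R) + 1e-15            # PHAT weighting: keep phase only
    cc = np.fft.irfft(R, n=n)
    max_shift = n // 2
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    return int(np.argmax(np.abs(cc))) - max_shift

# Hypothetical demo: mic 2 lags mic 1 by 8 samples.
rng = np.random.default_rng(1)
s = rng.standard_normal(4096)
mic1 = s
mic2 = np.concatenate([np.zeros(8), s[:-8]])
print(gcc_phat(mic2, mic1))  # estimated delay in samples
```

The whitening step is what makes GCC-PHAT attractive in reverberant rooms: the correlation peak depends on phase alignment rather than on the (noise-sensitive) magnitude spectrum.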
... Speech enhancement can be based on single-channel techniques (e.g. spectral subtraction) or multi-channel techniques such as adaptive noise cancellation [3], blind source separation or beamforming. Despite the quite impressive performance of all these methods, we are still far from human efficiency in terms of speech recognition. ...
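As a concrete example of the single-channel case, here is a minimal magnitude spectral-subtraction sketch (an illustrative assumption, not code from the cited work): the noise magnitude spectrum is estimated from an initial noise-only stretch, subtracted from each frame's magnitude with a floor at zero, and the signal is resynthesised with the noisy phase via overlap-add.

```python
import numpy as np

def spectral_subtraction(noisy, noise_est_frames=10, frame=256, hop=128):
    """Basic magnitude spectral subtraction with overlap-add.

    The first `noise_est_frames` frames are assumed noise-only."""
    # periodic Hann window: sums to 1 at 50% overlap (COLA)
    win = 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(frame) / frame)
    n_frames = (len(noisy) - frame) // hop + 1
    spectra = [np.fft.rfft(win * noisy[i*hop:i*hop+frame]) for i in range(n_frames)]
    noise_mag = np.mean([np.abs(s) for s in spectra[:noise_est_frames]], axis=0)
    out = np.zeros(len(noisy))
    for i, s in enumerate(spectra):
        mag = np.maximum(np.abs(s) - noise_mag, 0.0)   # subtract, floor at 0
        clean = mag * np.exp(1j * np.angle(s))         # keep the noisy phase
        out[i*hop:i*hop+frame] += np.fft.irfft(clean, n=frame)
    return out

# Hypothetical demo: a 440 Hz tone in white noise, with a noise-only lead-in.
rng = np.random.default_rng(2)
fs = 16000
t = np.arange(2 * fs)
clean = np.sin(2 * np.pi * 440 * t / fs) * (t > 4000)
noisy = clean + 0.3 * rng.standard_normal(len(t))
enhanced = spectral_subtraction(noisy)
```

The well-known weakness of this scheme, which motivates the multi-channel alternatives listed above, is "musical noise": the random residual peaks left after subtraction.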
In this paper a filter-based model of sound propagation, combined with a model of a multi-microphone array, is presented. The simple operations employed in the model, such as summation and linear convolution, allow one to implement it easily using numerical methods. Additionally, the described solution was discussed in connection with a filter-based model of a multi-microphone array. The usefulness of the models was demonstrated in the context of filter-and-sum beamformer synthesis. Practical methods of calculating filters for 1-D multi-microphone arrays were presented, and the results were then employed to synthesize filters for a wideband four-microphone array.
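The filter-and-sum structure reduces to exactly the two operations named above. Below is a minimal numerical sketch (not the paper's synthesis method): each channel is convolved with its own filter and the results are summed; choosing pure-delay filters turns it into the familiar delay-and-sum special case that time-aligns a four-microphone array.

```python
import numpy as np

def filter_and_sum(mic_signals, filters):
    """Filter-and-sum beamformer: convolve each channel with its
    filter, then sum the results."""
    return sum(np.convolve(x, h) for x, h in zip(mic_signals, filters))

# Hypothetical delay-and-sum special case for 4 microphones.
rng = np.random.default_rng(3)
s = rng.standard_normal(2000)
delays = [0, 3, 6, 9]                 # per-channel arrival delays in samples
mics = [np.concatenate([np.zeros(d), s]) for d in delays]
L = max(len(m) for m in mics)
mics = [np.pad(m, (0, L - len(m))) for m in mics]
filters = []
for d in delays:
    h = np.zeros(10)
    h[9 - d] = 1.0                    # pure delay of (9 - d) samples
    filters.append(h)
y = filter_and_sum(mics, filters)     # all channels align at offset 9: y ≈ 4*s
```

Replacing the unit impulses with longer FIR filters gives the general wideband filter-and-sum beamformer the abstract refers to; the filters are then designed per frequency band rather than as pure delays.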