Spectrograms for a male speaker at seat "E" in row "A" in the CoRE conference room. The utterance is "The local drugstore was charged with illegally dispensing tranquilizers." (a) Clean speech sent to the input of the loudspeaker. (b) Speech captured from a single microphone approx. 3 meters away. (c) Speech processed through the MFA.

Source publication
Article
Matched Filter Array (MFA) Processing has a distinct advantage over delay-sum beamforming for reverberant enclosures in that it is able to cohere significant reflected images in addition to the direct arrivals and is capable of spatial selectivity in three dimensions. However, previous study of this technique utilized a very large number of sensors,...

Contexts in source publication

Context 1
... listening to these results it is noted that the MFA has indeed removed a significant amount of reverberation. Figure 7 shows spectrograms for a male speaker at seat "E" in row "A" in the conference room in Figure 5. The spectrograms are arranged in the same manner as Figure 6. ...
Context 2
... spectrograms are arranged in the same manner as Figure 6. A large amount of spectral smearing due to reverberation is seen in the single-microphone signal of Figure 7b. In Figure 7c, it is seen that the MFA removed a significant amount of the reverberation. ...
Context 3
... large amount of spectral smearing due to reverberation is seen in the single-microphone signal of Figure 7b. In Figure 7c, it is seen that the MFA removed a significant amount of the reverberation. ...

Citations

... Unfortunately, in practical applications, this estimate is still not usable because of its high sensitivity to noise. The second method is termed the "matched filter array-" (MFA-) based algorithm [19,20] in which the impulse response functions are precomputed by exploiting the known geometric relationship between the sound source and an array of sensors, based on the image model method [21,22]. By convolving the captured signal with the precomputed impulse responses, the signal-to-noise ratio (SNR) of a delay-and-sum beamformer could be significantly increased [19,20], however, its computational demand is also significant. ...
... The second method is termed the "matched filter array-" (MFA-) based algorithm [19,20] in which the impulse response functions are precomputed by exploiting the known geometric relationship between the sound source and an array of sensors, based on the image model method [21,22]. By convolving the captured signal with the precomputed impulse responses, the signal-to-noise ratio (SNR) of a delay-and-sum beamformer could be significantly increased [19,20], however, its computational demand is also significant. Due to the high computational requirement, the real-time application of this method requires a special hardware system [23], thus it has not become widely used. ...
... The source localization problem has led to several proposed signal models which are discussed in [2]. In our work, we utilize a similar signal model that was previously used by Renomeron and his colleagues in [20]. We assume a sound source of point-like spatial extent at location s, where s ∈ C and C is a set of discrete points in three-dimensional space, related to possible sound source locations. ...
Article
Speaker localization with microphone arrays has received significant attention in the past decade as a means for automated speaker tracking of individuals in a closed space for videoconferencing systems, directed speech capture systems, and surveillance systems. Traditional techniques are based on estimating the relative time difference of arrivals (TDOA) between different channels, by utilizing crosscorrelation function. As we show in the context of speaker localization, these estimates yield poor results, due to the joint effect of reverberation and the directivity of sound sources. In this paper, we present a novel method that utilizes a priori acoustic information of the monitored region, which makes it possible to localize directional sound sources by taking the effect of reverberation into account. The proposed method shows significant improvement of performance compared with traditional methods in "noise-free" condition. Further work is required to extend its capabilities to noisy environments.
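The traditional TDOA approach mentioned in the abstract above can be illustrated with a minimal sketch. It assumes two single-channel recordings of equal sampling rate and uses plain (unweighted) cross-correlation; the function name `estimate_tdoa` and the synthetic test signal are hypothetical and are not taken from the cited paper. In a reverberant room, the peak picked here is exactly what the abstract reports as unreliable.

```python
import numpy as np

def estimate_tdoa(x1, x2, fs):
    """Return the delay of x2 relative to x1 in seconds (positive if x2 lags)."""
    # Full cross-correlation: lag k scores how well x2[n + k] matches x1[n].
    corr = np.correlate(x2, x1, mode="full")
    # In "full" mode the lags run from -(len(x1) - 1) to len(x2) - 1.
    lags = np.arange(-(len(x1) - 1), len(x2))
    return lags[np.argmax(corr)] / fs

# Synthetic, reverberation-free check (assumed values, for illustration only).
fs = 16000
true_delay = 7  # samples
rng = np.random.default_rng(0)
x1 = rng.standard_normal(1024)
x2 = np.concatenate([np.zeros(true_delay), x1[:-true_delay]])
tdoa_seconds = estimate_tdoa(x1, x2, fs)
```

With a single clean path the correlation peak sits at the true lag; added reflections create competing peaks, which is the failure mode the abstract attributes to reverberation.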
... In some more recent papers [6], [30], fine-band RTF modeling is performed and simulation results in different bands are assembled together to form the final response. As a particular example, MFA processing [31], [32], [33] relies on fast RIR computation in an attempt to coherently pick up not only the source signal itself but also its significant reflections by filtering the received signal with the time reverse of the RIR for given positions of the source and the receiver, resulting in a significant improvement of the output signal SNR compared to simple beamforming. The image model [11] is used to model the RIR in [33]. ...
Article
Reverberation in rooms is often simulated with the image method due to Allen and Berkley (1979). This method has an asymptotic complexity that is cubic in terms of the simulated reverberation length. When employed in the frequency domain, it is relatively computationally expensive if there are many receivers in the room or if the source or receiver positions are changing with time. The computational complexity of the image method is due to the repeated summation of the fields generated by a large number of image sources. In this paper, a fast method to perform such summations is presented. The method is based on multipole expansion of the monopole source potential. For offline computation of the room transfer function for N image sources and M receiver points, use of the Allen-Berkley algorithm requires O(NM) operations, whereas use of the proposed method requires only O(N+M) operations, resulting in significantly faster computation of reverberant sound fields. The proposed method also has a considerable speed advantage in situations where the room transfer function must be rapidly updated online in response to source/receiver location changes. Simulation results are presented, and algorithm accuracy, speed, and implementation details are discussed. For problems that require frequency-domain computations, the algorithm is found to generate sound fields identical to the ones obtained with the frequency-domain version of the Allen-Berkley algorithm at a fraction of computational cost.
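The citing context above describes MFA processing as filtering each received signal with the time reverse of the room impulse response (RIR) for the given source and receiver positions. The sketch below illustrates that idea next to a plain delay-and-sum beamformer. It is not the authors' implementation: the RIRs are random sparse stand-ins rather than image-model responses, and the names `matched_filter_array`, `delay_and_sum`, and all numeric values are assumptions made for illustration.

```python
import numpy as np

def matched_filter_array(mic_signals, rirs):
    """Filter each channel with its time-reversed RIR and sum the results."""
    out = 0.0
    for y, h in zip(mic_signals, rirs):
        out = out + np.convolve(y, h[::-1])
    return out

def delay_and_sum(mic_signals, direct_delays):
    """Advance each channel by its direct-path delay (in samples) and sum."""
    n = min(len(y) - d for y, d in zip(mic_signals, direct_delays))
    return sum(y[d:d + n] for y, d in zip(mic_signals, direct_delays))

# Synthetic RIRs (assumed): one unit direct arrival plus five weaker reflections.
rng = np.random.default_rng(0)
num_mics, rir_len = 4, 64
rirs, direct_delays = [], []
for _ in range(num_mics):
    h = np.zeros(rir_len)
    d = int(rng.integers(0, 8))
    h[d] = 1.0
    taps = rng.choice(np.arange(d + 1, rir_len), size=5, replace=False)
    h[taps] = rng.uniform(0.3, 0.7, size=5)
    rirs.append(h)
    direct_delays.append(d)

# A unit impulse as the source makes each microphone signal equal to its RIR.
source = np.array([1.0])
mics = [np.convolve(source, h) for h in rirs]

mfa_out = matched_filter_array(mics, rirs)
das_out = delay_and_sum(mics, direct_delays)

# The MFA peak collects the energy of every tap; delay-and-sum only aligns
# the direct arrivals.
mfa_peak = mfa_out[rir_len - 1]
das_peak = das_out[0]
```

For an impulse source the MFA output is the sum of the RIR autocorrelations, so its peak at index `rir_len - 1` equals the total RIR energy, while the delay-and-sum peak is just the sum of the direct-path amplitudes. The side lobes before the MFA peak are the pre-echo that some of the citing contexts mention.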
... In fact, we note that only a few works have touched on this issue. In [10], a matched filter array claiming to have the ability of 3-D local acoustic selectivity was presented. The main limitations of this method are that it requires a large number of sensors (ranging from tens to thousands) and that the exact locations of the sound source and the microphone array sensors must be known a priori. ...
Conference Paper
In this paper, a microphone array with a 3-D focal zone is proposed. The microphone array consists of one omni-directional and two uni-directional microphones. The microphone array is constructed so that a cross zone is formed such that only the sound within this zone is captured and any interference outside the zone is effectively cancelled. The proposed framework is flexible in defining the location/size of the closed volume where the sound source of interest is located. Simulations have been carried out to demonstrate the 3-D spatial selectivity as well as the noise cancellation performance. The most important feature that differs from previous works is that the super volumetric selectivity is realized by strategically using only three microphones, by which the overall apparatus acts as a virtual wireless close-talking microphone with a position constrained in both distance and direction.
... The energy before the largest tap, in time, causes pre-echo (an attenuated replica of a signal that arrives before the actual signal). Pre-echo degrades the audible quality of the processed waveform [10]. To overcome this problem, c_r(m) is correlated with a truncated version of itself [11]. ...
Conference Paper
We show that even moderate reverberation has a detrimental effect on the audible quality of speech and automatic speech recognition (ASR) accuracy. In the presence of room reverberation, we assess the performance of several important speech enhancement techniques, and show that little improvement is offered. We experimentally show that multiple microphones are necessary for complete equalization of the speaker-to-receiver impulse response. Furthermore, if complete equalization is not possible, long reverberation time (RT60) is shown to affect ASR accuracy far more negatively than a low signal-to-reverberation ratio (SRR). Using this knowledge we develop an equalizing strategy that improves ASR accuracy by reducing RT60.
... In some cases it is possible to replace the beamformer and matched filter by a single calculation. This is an ongoing field of research, see for example the matched filter array processing method of Renomeron et al. [111] and the beamspace adaptive matched filter method of Yang et al. [141]. While these methods have the potential to improve performance when used alongside the new techniques presented here, such considerations are outside the scope of this thesis. ...
Article
Noise is frequently encountered when processing data from the natural environment, and is of particular concern for remote-sensing applications where the accuracy of data gathered is limited by the noise present. Rather than merely accepting that sonar noise results in unavoidable error in active sonar systems, this research explores various methodologies to reduce the detrimental effect of noise. Our approach is to analyse the statistics of sonar noise in trial data, collected by a long-range active sonar system in a shallow water environment, and apply this knowledge to target detection. Our detectors are evaluated against simulated targets in simulated noise, simulated targets embedded in noise-only trial data, and trial data containing real targets. First, we demonstrate that the Weibull and K-distributions offer good models of sonar noise in a cluttered environment, and that the K-distribution achieves the greatest accuracy in the tail of the distribution. We demonstrate the limitations of the Kolmogorov-Smirnov goodness-of-fit test in the context of detection by thresholding, and investigate the upper-tail Anderson-Darling test for goodness-of-fit analysis. The upper-tail Anderson-Darling test is shown to be more suitable in the context of detection by thresholding, as it is sensitive to the far-right tail of the distribution, which is of particular interest for detection at low false alarm rates. We have also produced tables of critical values for K-distributed data evaluated by the upper-tail Anderson-Darling test. Having established suitable models for sonar noise, we develop a number of detection statistics. These are based on the box-car detector, and the generalized likelihood ratio test with a Rician target model. Our performance analysis shows that both types of detector benefit from the use of the noise model provided by the K-distribution.
We also demonstrate that for weak signals, our GLRT detectors are able to achieve greater probability of detection than the box-car detectors. The GLRT detectors are also easily extended to use more than one sample in a single test, an approach that we show to increase probability of detection when processing simulated targets. A fundamental difficulty in estimating model parameters is the small sample size. Many of the pings in our trial data overlap, covering the same region of the sea. It is therefore possible to make use of samples from multiple pings of a region, increasing the sample size. For static targets, the GLRT detector is easily extended to multi-ping processing, but this is not as easy for moving targets. We derive a new method of combining noise estimates over multiple pings. This calculation can be applied to either static or moving targets, and is also shown to be useful for generating clutter maps. We then perform a brief performance analysis on trial data containing real targets, where we show that in order to perform well, the GLRT detector requires a more accurate model of the target than the Rician distribution is able to provide. Despite this, we show that both GLRT and box-car detectors, when using the K-distribution as a noise model, can achieve a small improvement in the probability of detection by combining estimates of the noise parameters over multiple pings.
... In some more recent papers [6], [30], fine-band RTF modeling is performed and simulation results in different bands are assembled together to form the final response. As a particular example, MFA processing [31], [32], [33] relies on fast RIR computation in an attempt to coherently pick up not only the source signal itself but also its significant reflections by filtering the received signal with the time reverse of the RIR for given positions of the source and the receiver, resulting in a significant improvement of the output signal SNR compared to simple beamforming. The image model [11] is used to model the RIR in [33]. ...
Article
Reverberation in rooms is often simulated with the image method due to Allen and Berkley (1979). This method has asymptotic complexity that is cubic in terms of the simulated reverberation length. When employed in the frequency domain, it is relatively computationally expensive to use for many receivers in the room or in a dynamically changing configuration due to the repeated summation of the fields generated by a large number of image sources. In this paper, a fast method to perform such summations is presented. The method is based on the multipole expansion of the monopole source potential. For offline computation of the room transfer function for N image sources and M receiver points, use of Allen-Berkley algorithm requires O(NM) operations, whereas use of the proposed method requires only O(N + M) operations, resulting in significantly faster computation of reverberant sound fields. The proposed method also has considerable speed advantage in situations where the room transfer function must be rapidly updated online in response to the source/receiver location changes. Simulation results are presented, and algorithm accuracy, speed, and implementation details are discussed. For problems that require frequency-domain computations, the algorithm is found to generate sound fields that are identical to the ones obtained with the frequency-domain version of the Allen-Berkley algorithm at a fraction of computational cost.
... It thus enables deployment of DARPA speech recognition technology in hands-busy/eyes-busy and/or distant-talking applications. The combined advantages of microphone arrays and neural network (NN) computing are used to expand the capabilities of DARPA speech recognition technology to application environments where users must not be encumbered by body-worn or hand-held microphones, and must have freedom of movement. (Examples include Combat Information Centers, large group conferences, and mobile hands-busy eyes-busy maintenance tasks.) ...
... The use of NN reduces the error. From Table III, the feedforward-network configuration, wdnet (5,4,1), yields the smallest error of 0.0775, or 35% improvement. It is also seen that, for this dataset, the feedforward non-linear network outperforms the adaline structure, but at the cost of considerably more computation per training epoch. ...
... These states are labeled as {1, 2, ..., N} and the state at time t as q_t. 3. Let the observations in any state be characterized by a multivariate Gaussian probability distribution (which signifies that the observations are not discrete but continuous). The state transition probability, defined as the probability of moving from state i to j in one time step, can be concisely represented in matrix form with a_ij as the element in the i-th row and j-th column of a matrix A. 6. The initial state probabilities are defined as π_i = P(q_1 = i) for 1 ≤ i ≤ N. Thus a complete specification of the model would require the parameters N, M, A, B, and π. In a compact notation the model is denoted as ... Given a set of training data, how can the model be estimated? Problem 1: If this were solved directly, it can be shown that it requires about 2T·N^T computations, which is quite unreasonable. ...
Chapter
Natural communication with machines is a crucial factor in bringing the benefits of networked computers to mass markets. In particular, the sensory dimensions of sight, sound and touch are comfortable and convenient modalities for the human user. New technologies are now emerging in these domains that can support human/machine communication with features that emulate face-to-face interaction. A current challenge is how to integrate the, as yet, imperfect technologies to achieve synergies that transcend the benefit of a single modality. Because speech is a preferred means for human information exchange, conversational interaction with machines will play a central role in collaborative knowledge work mediated by networked computers. Utilizing speech in combination with simultaneous visual gesture and haptic signaling requires software agents able to fuse the error-susceptible sensory information into reliable interpretations that are responsive to (and anticipatory of) human user intentions. This report draws a perspective on research in human/machine communication technologies aimed to support computer conferencing and collaborative problem solving.
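The hidden Markov model excerpt quoted in the citing contexts above notes that evaluating the model directly costs on the order of 2T·N^T computations. The sketch below makes that cost concrete by enumerating every state sequence, and compares it with the standard forward recursion, which the excerpt itself does not name and which is included here only as a well-known alternative. All names and the random model parameters are hypothetical; `b[t, i]` is assumed to hold the already-evaluated observation likelihood at time t in state i (for example, a Gaussian density value).

```python
import itertools
import numpy as np

def likelihood_direct(pi, A, b):
    """Sum over all N**T state sequences (only feasible for tiny T)."""
    T, N = b.shape
    total = 0.0
    for path in itertools.product(range(N), repeat=T):
        p = pi[path[0]] * b[0, path[0]]
        for t in range(1, T):
            p *= A[path[t - 1], path[t]] * b[t, path[t]]
        total += p
    return total

def likelihood_forward(pi, A, b):
    """Forward recursion: on the order of N**2 * T multiplications."""
    alpha = pi * b[0]
    for t in range(1, b.shape[0]):
        alpha = (alpha @ A) * b[t]
    return alpha.sum()

# Small random model (assumed values) so the exhaustive sum stays tractable.
rng = np.random.default_rng(0)
N, T = 3, 6
A = rng.random((N, N))
A /= A.sum(axis=1, keepdims=True)   # rows of A sum to one
pi = rng.random(N)
pi /= pi.sum()                      # initial state probabilities
b = rng.random((T, N))              # stand-in observation likelihoods

p_direct = likelihood_direct(pi, A, b)
p_forward = likelihood_forward(pi, A, b)
```

Both functions return the same observation likelihood; the difference is that the exhaustive version visits N**T paths, so it becomes unusable for realistic sequence lengths.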
Conference Paper
In this paper a binaural noise reduction system based on the adaptive matched filter array (MFA) and post-filtering is presented. The binaural MFA filter is formed using the interaural impulse response between the left and right microphone signals, which is estimated by means of the NLMS algorithm. The residual noise reduction by three post-filter types is then compared and evaluated according to objective and subjective measures. Furthermore, the performance of the algorithm for preserving binaural cues is discussed.
Conference Paper
This paper presents our laboratory’s Intelligent Room, an experimental environment for bringing computation into the realm of ordinary, everyday activity and enabling natural human-computer interaction. We first present the notion of an Intelligent Environment and describe how it differs from other paradigms in human-computer interaction. We then discuss our Intelligent Room’s hardware and software components and one of the room’s current applications. We also outline criteria for designing and evaluating highly interactive spaces.