Article

Estimation of modal decay parameters from noisy response measurements

Abstract

Estimating modal decay parameters from noisy measurements of reverberant and resonating systems is a common problem in audio and acoustics, for example in room and concert hall measurements or in musical instrument modeling. Reliable methods for estimating the initial response level, decay rate, and noise floor level from noisy measurement data are studied and compared. A new method, based on the nonlinear optimization of a model for exponential decay plus a stationary noise floor, is presented. A comparison with traditional decay parameter estimation techniques on simulated measurement data shows that the proposed method outperforms them in accuracy and robustness, especially in extreme SNR conditions. Three practical applications of the method are demonstrated.
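The decay-plus-noise idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the dB-domain least-squares cost, the function names `decay_plus_noise_db` and `fit_decay_plus_noise`, the parameterization (initial level in dB, decay rate in dB/s, noise floor in dB), and the crude initial guesses are all assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import least_squares


def decay_plus_noise_db(params, t):
    """Energy envelope in dB: exponential decay plus a constant noise floor.

    params = (level_db, decay_db_per_s, noise_db); hypothetical parameterization.
    """
    level_db, decay_db_per_s, noise_db = params
    decay_energy = 10.0 ** ((level_db - decay_db_per_s * t) / 10.0)
    noise_energy = 10.0 ** (noise_db / 10.0)
    return 10.0 * np.log10(decay_energy + noise_energy)


def fit_decay_plus_noise(t, envelope_db):
    """Fit the three model parameters to a measured envelope given in dB."""
    t = np.asarray(t, dtype=float)
    envelope_db = np.asarray(envelope_db, dtype=float)

    # Crude initial guesses (an assumption, not part of any published method):
    # start level from the first 10% of samples, noise floor from the last 10%.
    n_edge = max(1, len(t) // 10)
    level0 = float(np.max(envelope_db[:n_edge]))
    noise0 = float(np.mean(envelope_db[-n_edge:]))
    decay0 = max(1.0, 2.0 * (level0 - noise0) / (t[-1] - t[0]))

    def residuals(p):
        return decay_plus_noise_db(p, t) - envelope_db

    result = least_squares(residuals, x0=[level0, decay0, noise0])
    return result.x  # (level_db, decay_db_per_s, noise_db)
```

Under this parameterization, a reverberation time follows as `60.0 / decay_db_per_s`. Fitting all three parameters jointly means the noise floor does not have to be truncated or subtracted beforehand, which is the property the abstract emphasizes for low-SNR data.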

... One might obtain initial guesses in many ways. For example, Mäkivirta et al. [7] estimate the resonant frequencies from the resonant peaks in the short-term Fourier transform of the transfer function, and estimate the damping coefficients by fitting an exponential decay model to the decaying signal at each estimated resonant frequency, as proposed by Karjalainen et al. [21]. This approach might be used when one is interested in estimating the resonant frequencies and damping coefficients of specific resonant peaks. ...
... In such cases, initial guesses of the damping coefficients can also be utilized to improve estimates. The method provided by Karjalainen et al. [21] can be used to obtain good estimates. ...
... In this work, eigenvalue analysis of the undamped room, and damping coefficient estimation from a room impulse response [21], are used to generate initial guesses. ...
Conference Paper
The acoustic eigenvalues of an enclosure quantify its resonant frequencies and damping coefficients. This information can be used to determine the acoustic characteristics of a room. Various methods for estimating eigenvalues from a measured room impulse response can be found in the literature. Of these methods, the matrix pencil method is considered in this work. The matrix pencil method provides acceptable eigenvalue estimates, albeit alongside several spurious eigenvalues. In this work, the appearance of spurious eigenvalues is demonstrated by analyzing an analytic impulse response. Additionally, the use of the well-known Rayleigh quotient iteration algorithm to search for eigenvalues close to informed initial guesses and, thus, to minimize the estimation of spurious eigenvalues is presented. The approach is verified using finite element eigensolutions of a room.
... the reverberation time estimation [3,5,7,8]. Alternatively, Xiang [9] and Karjalainen et al. [10] proposed to include an additional noise term in the model and perform a nonlinear regression. ...
... In this case, the noise term of the model is neglected, i.e., N_0 = 0. To get accurate estimates for T_1, the effect of the noise has to be countered by noise subtraction [5] or truncation of the RIR before backwards integration [3,7,8]. The noise term N_0 can be included in the model by using non-linear regression [9,10]. ...
... We compare the DecayFitNet with two other decay analysis approaches. The first baseline method is based on a non-linear regression model of a single exponential and a noise term [10]. The second baseline method is based on slice sampling for Bayesian decay analysis [20]. ...
Preprint
Full-text available
An established model for sound energy decay functions (EDFs) is the superposition of multiple exponentials and a noise term. This work proposes a neural-network-based approach for estimating the model parameters from EDFs. The network is trained on synthetic EDFs and evaluated on two large datasets of over 20000 EDF measurements conducted in various acoustic environments. The evaluation shows that the proposed neural network architecture robustly estimates the model parameters from large datasets of measured EDFs, while being lightweight and computationally efficient. An implementation of the proposed neural network is publicly available.
... One might obtain initial guesses in a number of ways. For example, Mäkivirta et al. [6] estimate the resonance frequencies from the resonance peaks in the short-term Fourier transform of the transfer function and estimate the damping constants by fitting an exponential decay model to the decaying signal at each estimated resonance frequency, as proposed by Karjalainen et al. [25]. This approach might be used to estimate the resonance frequencies and damping constants of specific resonance peaks. ...
... In such cases, initial guesses of the damping constants can also be utilized to improve estimates. The method provided by Karjalainen et al. [25] can be used to obtain reasonable estimates. Alternatively, an energy decay curve and, subsequently, the reverberation time can be computed from the room impulse response [27]. ...
Article
Various methods for estimating acoustic eigenvalues from a measured signal can be found in the literature. Among these methods, the matrix pencil method is a popular choice. The matrix pencil method provides reasonable eigenvalue estimates, albeit alongside several spurious estimates. This work uses Rayleigh quotient iteration to avoid these spurious values. Additionally, a sparse matrix structure is used to write the matrix pencil. The approach is verified using simulated data and validated using measured data. When reliable initial guesses for a few eigenvalues are available, the proposed method provides efficient and accurate eigenvalue estimates.
... In addition, using only positive amplitudes can only model the monotonically decaying behavior, not the initial increase. Thus, we propose an improved FADE-IN model, which performs fitting on the RIR envelopes, as demonstrated by [20], and relaxes the optimization constraint to allow negative amplitude values. Despite allowing negative amplitudes, the objective function is still constrained, such that the total sum of the exponentials remains positive. ...
... The function f(·) applied to the envelopes is a power law scaling function with a factor of 0.5, given by $f(y(t)) = [y(t)]^{0.5} = \sqrt{y(t)}$. This scaling has been shown to be an effective compromise between the linear and the logarithmic scaling [20]. ...
Preprint
Full-text available
In multi-room environments, modelling the sound propagation is complex due to the coupling of rooms and diverse source-receiver positions. A common scenario is when the source and the receiver are in different rooms without a clear line of sight. For such source-receiver configurations, an initial increase in energy is observed, referred to as the "fade-in" of reverberation. Based on recent work of representing inhomogeneous and anisotropic reverberation with common decay times, this work proposes an extended parametric model that enables the modelling of the fade-in phenomenon. The method performs fitting on the envelopes, instead of energy decay functions, and allows negative amplitudes of decaying exponentials. We evaluate the method on simulated and measured multi-room environments, where we show that the proposed approach can now model the fade-ins that were unrealisable with the previous method.
... Prior work on RIR estimation mainly deals with recorded audio signals and does not take into account visual cues. Directly estimating the RIR from source reverberant speech has been extensively studied using traditional signal processing methods [18,25,35,48]. However, these approaches may not work well in some real-world applications, mainly because they are based on the assumption that the source is a modulated Gaussian pulse, not actual speech [25,35], or they require prior knowledge of specific attributes of the speaker or the microphone used for recording [18,48]. Recently, neural learning-based RIR estimation techniques have been proposed to estimate RIR from reverberant speech [69,81]. ...
Conference Paper
Full-text available
Accurate estimation of Room Impulse Response (RIR), which captures an environment's acoustic properties, is important for speech processing and AR/VR applications. We propose AV-RIR, a novel multi-modal multi-task learning approach to accurately estimate the RIR from a given reverberant speech signal and the visual cues of its corresponding environment. AV-RIR builds on a novel neural codec-based architecture that effectively captures environment geometry and materials properties and solves speech dereverberation as an auxiliary task by using multi-task learning. We also propose Geo-Mat features that augment material information into visual cues and CRIP that improves late reverberation components in the estimated RIR via image-to-RIR retrieval by 86%. Empirical results show that AV-RIR quantitatively outperforms previous audio-only and visual-only approaches by achieving 36%-63% improvement across various acoustic metrics in RIR estimation. It also achieves higher preference scores in human evaluation. As an auxiliary benefit, dereverbed speech from AV-RIR shows competitive performance with the state-of-the-art in various spoken language processing tasks and outperforms reverberation time error score in the real-world AVSpeech dataset. Qualitative examples of both synthesized reverberant speech and enhanced speech are available online.
... Therefore, the initial onset and noise floor bias at the beginning and the end of the envelope should be excluded from the fitting interval, respectively. However, in cases where the useful dynamic range of the measured data is severely limited, alternative fitting approaches such as non-linear decay plus noise models or a dedicated noise compensation stage before ETC calculation may be more appropriate [4]-[6]. Additionally, the stationary nature of the FFT calculation within individual STFT analysis windows may distort the true decay envelope of the mode peaks to some extent [7]. ...
... However, due to the high variability of room transfer functions with source and receiver positions, room equalisation systems typically only aim to improve the system performance over a pre-determined listening area. Examples include the parametric equalisation of individual room modes [4], [5] and simple adjustments such as tone and low-frequency tilt controls found on amplifiers and AV receivers. ...
Conference Paper
Full-text available
Below their Schroeder frequencies, most domestic listening rooms suffer from detrimental room contributions due to standing wave modes. Passive LF treatments such as resonant absorbers and bass traps are often impractical in such spaces due to their large footprint or limited operational bandwidth requiring multiple tuned units to achieve wideband improvement. As a result, acoustic room correction solutions have been an area of significant commercial and research interest over the last two decades. Commonly, acoustic room correction systems follow one of the two principal approaches - a. correction by pre-processing the signal to be reproduced and b. correction of the acoustic environment to reduce or limit its influence on the reproduced sound. In this paper, a comprehensive analysis scheme compares and contrasts various features of the two types of correction systems in the time, frequency, and spatial domains by means of numerical room modelling and characterisation of meaningful performance metrics.
... For each position in a room, one randomly selected noise from 5 noises was added to each utterance with an SNR selected from 0 dB, 5 dB, 10 dB, 15 dB and 20 dB. 150 noisy reverberant speech signals were used for each position and 5 positions were measured in each room, yielding 3000 utterances in total. The ground truth of the T60 was calculated from the RIR in this position [38]. To validate the calculated T60 using RIR, ...
... The T60 ranges from 0.2 s to 1.5 s. The ground truth of T60 is calculated using the method in [38]. Two unseen room sizes are simulated to generate 234 synthetic RIRs for the validation set. ...
Preprint
The reverberation time is one of the most important parameters used to characterize the acoustic property of an enclosure. In real-world scenarios, it is much more convenient to estimate the reverberation time blindly from recorded speech compared to the traditional acoustic measurement techniques using professional measurement instruments. However, the recorded speech is often corrupted by noise, which has a detrimental effect on the estimation accuracy of the reverberation time. To address this issue, this paper proposes a two-stage blind reverberation time estimation method based on noise-aware time-frequency masking. This proposed method has a good ability to distinguish the reverberation tails from the noise, thus improving the estimation accuracy of reverberation time in noisy scenarios. The simulated and real-world acoustic experimental results show that the proposed method significantly outperforms other methods in challenging scenarios.
... For all of the following analyses, the reverberation times were determined in octave-bands from the omnidirectional channel of the higher-order Ambisonic RIR measurements by fitting a non-linear decay-plus-noise model according to [21]. The total absorption areas of the room configurations were calculated by summing the absorption areas of all room and furniture object surfaces without considering occluded surfaces. ...
... fitting algorithm returned incorrect reverberation times for room configurations that feature double-slope decays. The non-linear decay-plus-noise model, as implemented in [21], fitted neither of the two slopes correctly, thus resulting in an averaged slope somewhere between the two slopes. Figure 6 illustrates this fitting problem with one of the dataset's double-slope decays. ...
Conference Paper
This paper presents Motus, a new dataset of higher-order Ambisonic room impulse responses. The measurements took place in a single room while varying the amount and placement of furniture. 830 different room configurations were measured with four source-to-receiver configurations, resulting in 3320 room impulse responses in total. The dataset features various furniture object placements, including non-uniform distributions of absorptive material and cases with occluded direct paths between source and receiver. All acoustic measurements are accompanied by matching 3D models and 360 degree-photographs of the room. After describing the dataset, we demonstrate its usage with a reverberation time analysis. The analysis reveals that most of our measurements follow the expected relationship between absorption area and reverberation time. Some exceptional cases feature particular room acoustic phenomena, such as non-uniform absorption area distributions or multi-slope decays. Additionally, we show with a large number of measurements that furniture placement can significantly affect the reverberation time of a room. The dataset can be used to investigate room acoustic topics such as the acoustic effects of absorber placements or the decay behavior of rooms.
... We provide two acoustic parameters: (i) Reverberation time, T60 [23] and (ii) Clarity, C50 [24] for all audio clips in clean speech of training set. We provide T60, C50 and isReal Boolean flag for all RIRs where isReal is 1 for real RIRs and 0 for synthetic ones. ...
... The two parameters are correlated. An RIR with low C50 can be described as highly reverberant and vice versa [23,24]. These parameters are supposed to provide flexibility to researchers for choosing a sub-set of provided data for controlled studies. ...
... Initially, the objective parameters like reverberation time, percentage articulation (PA) [1], decay rates [2], and statistical measures of room impulse responses (RIR) [3] were the only measures of reverberation. However, later studies [4,5] found that these measures vary with the sound frequency and wall surface properties. ...
... By examining $E\{H_d H_d^*\}$ based on (2), where E{·} represents the statistical expectation operator, we can express the direct path power as ...
Article
Full-text available
Featured Application: Room Mode Analysis. Abstract: Modal decays and modal power distribution in acoustic environments are key factors in deciding the perceptual quality and performance accuracy of audio applications. This paper presents the application of the eigenbeam spatial correlation method in estimating the time-frequency-dependent directional reflection powers and modal decay times. The experimental results evaluate the application of the proposed technique for two rooms with distinct environments using their room impulse response (RIR) measurements recorded by a spherical microphone array. The paper discusses the classical concepts behind room mode distribution and the reasons behind their complex behavior in real environments. The time-frequency spectrum of room reflections, the dominant reflection locations, and the directional decay rates emulate a realistic response with respect to the theoretical expectations. The experimental observations prove that our model is a promising tool in characterizing early and late reflections, which will be beneficial in controlling the perceptual factors of room acoustics.
... As opposed to [18], [25], we average the EDC computed on a set C of 29 one-third octave bands ranging from 20 Hz to 12.5 kHz. When using backward integration, background noise affects the entire EDC, leading to a vertical displacement at the beginning of the EDC [26]. To avoid emphasizing differences in noise level, all EDCs are normalized to 0 dB prior to computing (3). ...
... Here, Brandewie and Zahorik [24] showed that, after being exposed to the reverberant environment, speech reception thresholds improved, which indicates some level of compensation for the reverberation. They used the term "moderate" to describe a room that approximates the acoustic properties of a large office, with a broadband reverberation time (T60, [25]) of 0.42 s and a broadband clarity index (C50, [26]) of 13.4 dB, representing the ratio of the energy in the early reflections over the energy in the late reflections. ...
Article
Full-text available
Virtual acoustics enables hearing research and audiology in ecologically relevant and realistic acoustic environments, while offering experimental control and reproducibility of classical psychoacoustics and speech intelligibility tests. Hereby, indoor environments are highly relevant, where listening and speech communication frequently involve multiple targets and interferers, as well as connected adjacent spaces that may create challenging acoustics. Hence, a controllable laboratory environment is evaluated here (by room acoustical parameters and speech intelligibility) which closely resembles a typical German living room with an adjacent kitchen. Target and interferer positions were permuted over four different locations, including an acoustically challenging position of a target in the kitchen with interrupted line of sight. Speech intelligibility was compared in the real room, in virtual acoustic representations, and in standard anechoic audiological configurations. Three presentation modes were tested: headphones, loudspeaker rendering on a small-scale, four-channel loudspeaker array in a sound-attenuated listening booth, and a three-dimensional 86-channel loudspeaker array in an anechoic chamber. The results showed that the target talker in the coupled room requires higher signal to noise ratios (SNRs) at threshold than typical indoor conditions. Moreover, for the stationary speech shaped interferer, effects of room acoustics were negligible. For a majority of target positions, no difference between the four-channel and the large-scale loudspeaker array were found, with an overall good agreement to the real room. This indicates that ecologically valid testing is feasible using a clinically applicable small-scale loudspeaker array.
... For the T60 aware model, we extracted T60 values directly from the RIRs (oracle T60s) using a method similar to the one used for the ACE Challenge dataset [19], [20]. The fullband T60 we used was computed as the average of the estimates for the bands with center frequencies of 400 Hz, 500 Hz, 630 Hz, 800 Hz, 1000 Hz, and 1250 Hz. ...
Preprint
In this paper, we propose a model to perform speech dereverberation by estimating its spectral magnitude from the reverberant counterpart. Our models are capable of extracting features that take into account both short and long-term dependencies in the signal through a convolutional encoder (which extracts features from a short, bounded context of frames) and a recurrent neural network for extracting long-term information. Our model outperforms a recently proposed model that uses different context information depending on the reverberation time, without requiring any sort of additional input, yielding improvements of up to 0.4 on PESQ, 0.3 on STOI, and 1.0 on POLQA relative to reverberant speech. We also show our model is able to generalize to real room impulse responses even when only trained with simulated room impulse responses, different speakers, and high reverberation times. Lastly, listening tests show the proposed method outperforming benchmark models in reduction of perceived reverberation.
... Unfortunately, in many cases the background noise is not negligible, meaning that the ensemble mean of ln f^2(t) will not adhere to a straight line. Approaches for non-linear regression, where the influence of the noise is taken into account, have been proposed for non-blind reverberation time estimation [21,22]. However, non-linear regression requires numerical optimization and is thus demanding, especially compared to simple linear regression, in terms of computational complexity. ...
Preprint
The estimation of the decay rate of a signal section is an integral component of both blind and non-blind reverberation time estimation methods. Several decay rate estimators have previously been proposed, based on, e.g., linear regression and maximum-likelihood estimation. Unfortunately, most approaches are sensitive to background noise, and/or are fairly demanding in terms of computational complexity. This paper presents a low complexity decay rate estimator, robust to stationary noise, for reverberation time estimation. Simulations using artificial signals, and experiments with speech in ventilation noise, demonstrate the performance and noise robustness of the proposed method.
... The recordings were conducted at different sampling rates, but are downsampled to 16 kHz in the released dataset. The reverberation time, measured according to [Kar+02], spanned the range T 60 ∈ [0.05, 1.5] s, where the majority of the recordings had reverberation times T 60 < 0.5 s. It is important to note that while previous sections and experiments used a single nonlinearity for both training and testing, the AEC challenge dataset includes a large number of nonlinearities due to the use of a large number of audio devices in recording the echo signals. ...
... Another aspect to consider is the analysis method used to estimate the T60 from a measured RIR. If performed incorrectly, the estimation of the RIR energy decay can be hindered by background noise and multi-slope decay [17]. As such, Bayesian analysis proved suitable to model more complex behaviors in RIRs [18]. ...
Conference Paper
Full-text available
This paper seeks to improve the state-of-the-art in delay-network-based analysis-synthesis of measured room impulse responses (RIRs). We propose an informed method incorporating improved energy decay estimation and synthesis with an optimized feedback delay network. The performance of the presented method is compared against an end-to-end deep-learning approach. A formal listening test was conducted where participants assessed the similarity of reverberated material across seven distinct RIRs and three different sound sources. The results reveal that the performance of these methods is influenced by both the excitation sounds and the reverberation conditions. Nonetheless, the proposed method consistently demonstrates higher similarity ratings compared to the end-to-end approach across most conditions. However, achieving an indistinguishable synthesis of measured RIRs remains a persistent challenge, underscoring the complexity of this problem. Overall, this work helps improve the sound quality of analysis-based artificial reverberation.
... As opposed to [18], [25], we average the EDC computed on a set C of 29 one-third octave bands ranging from 20 Hz to 12.5 kHz. When using backward integration, background noise affects the entire EDC, leading to a vertical displacement at the beginning of the EDC [26]. To avoid emphasizing differences in noise level, all EDCs are normalized to 0 dB prior to computing (3). ...
Preprint
Full-text available
Automatic tuning of reverberation algorithms relies on the optimization of a cost function. While general audio similarity metrics are useful, they are not optimized for the specific statistical properties of reverberation in rooms. This paper presents two novel metrics for assessing the similarity of late reverberation in room impulse responses. These metrics are differentiable and can be utilized within a machine-learning framework. We compare the performance of these metrics to two popular audio metrics using a large dataset of room impulse responses encompassing various room configurations and microphone positions. The results indicate that the proposed functions based on averaged power and frequency-band energy decay outperform the baselines with the former exhibiting the most suitable profile towards the minimum. The proposed work holds promise as an improvement to the design and evaluation of reverberation similarity metrics.
... 3. From the band-limited energy decay curve, a j-th damping constant, σ_j, is computed using the method by Karjalainen et al. [18]. ...
Conference Paper
Full-text available
To accurately simulate interior acoustical fields, reliable descriptions of the physical properties of an acoustical space are required. While geometry, propagation medium properties, and source characteristics are, in general, easily accessible, the absorption characteristics of the bounding surfaces of a space can be difficult to obtain. Databases of absorption coefficients could be used for commonly used materials. However, absorption coefficient databases are of limited use at low frequencies, especially below 100 Hz. Furthermore, measurement-based estimations tend to be more reliable. Thus, there is an incentive to perform measurement-based estimations of the absorption characteristics of a room’s bounding surfaces at low frequencies. This work considers the estimation of sound absorption, given in terms of locally-reacting surface impedance, using two inverse methods. The first method utilizes the acoustic diffusion equation, and the second solves an inverse eigenvalue problem based on the Helmholtz equation. Spatially uniform, frequency-dependent, real-valued impedances of the bounding surfaces of a reverberation chamber are estimated. The methods are evaluated by comparing measured and simulated transfer functions and reverberation times, which are predicted using the estimated impedances. While the proposed methods can estimate locally-reacting surface impedance from measured room impulse responses at low frequencies, there is a margin for improvement.
... For noisy decays or decays with multiple decay rates, the parameter estimation requires more advanced approaches. Although the decay amplitudes A_{k,x} and noise amplitude N_{0,x} occur as linear coefficients in Equation (2.3), nonlinear parameter estimation approaches are required for fitting the model, because the decay times T_{k,x} appear in the exponent [56,57]. To this end, Xiang and Goggans proposed a Bayesian formalism for sound-energy-decay analysis [11], which can also be used to determine the model order [75,76]. ...
Thesis
Full-text available
The study of room acoustics has traditionally been of interest in architectural planning and design. With the spread of virtual- and augmented-reality technology, room-acoustic modelling has also become increasingly relevant for audio engines. The dynamic and fast-paced nature of such applications requires rendering systems to operate in real-time. However, accurate state-of-the-art room-acoustic-simulation technology is often computationally expensive, limiting its use for audio engines. Data-driven methods offer the potential to bypass expensive simulations, while ensuring convincing perceptual experiences. This dissertation works towards data-driven audio engines by exploring the interaction between room-acoustic modelling and data-driven methods. It comprises five peer-reviewed publications that investigate automatic data acquisition, robust room-acoustic analysis in complex environments, and data-driven room-acoustics rendering. As sound propagates through a room, it interacts with various surfaces, leading to a gradual energy decay over time. The properties of this energy decay significantly influence the acoustic impression evoked by a room, making it a widely studied topic in room-acoustic research. The first part of this thesis provides an overview of sound-energy decay, its analysis, and challenges associated with complex geometries featuring multiple rooms and non-uniform absorption-material distributions. To this end, it introduces a neural network for multi-exponential sound-energy-decay analysis. Moreover, spatial and directional variations of sound-energy decay are investigated, and a compact representation to model them is proposed. The second part of this thesis is centred around data-driven methods and explores how they can be applied to room-acoustics research. After elaborating on the properties of room-acoustic data, techniques for its large-scale acquisition are investigated. 
Two of the contained publications describe autonomous robot systems for conducting room-acoustic measurements. While the first one describes the general idea and the design constraints of a practical system, the second one extends the measurement strategy to complex geometries featuring multiple connected rooms. An overview of commonly used machine-learning concepts is provided, focusing on the ones relevant for the included publications. Finally, several applications of data-driven methods in room-acoustics research are described, including a summary of a late-reverberation rendering system proposed in one of the appended publications.
... Additionally, we will strive to collect more comprehensive and diverse room data to enhance the model's generalization capabilities. We also aim to update robust and state-of-the-art RT60 estimators [50][51][52] to obtain more accurate ground truth. These efforts will contribute to advancing the application of attention-based audio processing models in real-world scenarios. ...
Article
Full-text available
Dynamic parameterization of acoustic environments has drawn widespread attention in the field of audio processing. Precise representation of local room acoustic characteristics is crucial when designing audio filters for various audio rendering applications. Key parameters in this context include reverberation time (RT60) and geometric room volume. In recent years, neural networks have been extensively applied in the task of blind room parameter estimation. However, there remains a question of whether pure attention mechanisms can achieve superior performance in this task. To address this issue, this study employs blind room parameter estimation based on monaural noisy speech signals. Various model architectures are investigated, including a proposed attention-based model. This model is a convolution-free Audio Spectrogram Transformer, utilizing patch splitting, attention mechanisms, and cross-modality transfer learning from a pretrained Vision Transformer. Experimental results suggest that the proposed attention mechanism-based model, relying purely on attention mechanisms without using convolution, exhibits significantly improved performance across various room parameter estimation tasks, especially with the help of dedicated pretraining and data augmentation schemes. Additionally, the model demonstrates more advantageous adaptability and robustness when handling variable-length audio inputs compared to existing methods.
... Focusing more on the time domain, modal equalisation has recently grown in popularity as a research topic for reducing modal decay time 14 -16 . The decay time is identified 17,18 and a modal equalisation technique reduces the pole radii of the modes in the overall transfer function 16 . An alternative method finds peaks in the low frequency response, assumes they are due to resonances and introduces notch filters to reduce the decay time 19 . ...
... Reverberation time is the duration it takes for the sound pressure level to decay by 60 dB after the source ceases. From the Schroeder curve generated through backward integration of the squared RIR, we calculate the reverberation time (T20) by extrapolating the least-squares fit between −5 dB and −25 dB following the ISO 3382 recommendation [31,32]. Figure 5a,b present the T20 values across frequency in octave bands, derived from RIRs measured by the Rode NT-SF1 microphone and em32 Eigenmike, respectively. ...
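The T20 procedure quoted in this excerpt can be sketched in a few lines. The example below is only an illustration under stated assumptions: a synthetic exponentially decaying envelope stands in for a measured RIR, the sample rate and decay time are arbitrary, and the function names (`schroeder_db`, `fit_line`, `t20`) are hypothetical rather than taken from the cited work. It uses the Python standard library only.

```python
import math

def schroeder_db(h):
    """Backward-integrate the squared response; return the decay curve in dB re. its start."""
    edc, running = [], 0.0
    for sample in reversed(h):
        running += sample * sample
        edc.append(running)
    edc.reverse()
    return [10.0 * math.log10(e / edc[0]) for e in edc]

def fit_line(x, y):
    """Ordinary least-squares line fit; returns (slope, intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def t20(h, fs):
    """Fit the decay curve between -5 dB and -25 dB and extrapolate to 60 dB of decay."""
    curve = schroeder_db(h)
    pts = [(i / fs, lv) for i, lv in enumerate(curve) if -25.0 <= lv <= -5.0]
    slope, _ = fit_line([p[0] for p in pts], [p[1] for p in pts])
    return -60.0 / slope

# Assumed test signal: amplitude envelope whose energy falls by 60 dB in rt_true seconds.
fs = 8000
rt_true = 0.5
h = [math.exp(-3.0 * math.log(10.0) * (n / fs) / rt_true) for n in range(int(1.5 * fs))]
rt_est = t20(h, fs)
print(f"estimated T20-based reverberation time: {rt_est:.3f} s")
```

With a noise-free synthetic decay the estimate should land very close to `rt_true`; with real measurements the noise floor bends the curve, which is the problem the surrounding literature addresses.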
Article
This paper introduces RSoANU, a dataset of real multichannel room impulse responses (RIRs) obtained in a recording studio. Compared to the current publicly available datasets, RSoANU distinguishes itself by featuring RIRs captured using both a 32-channel spherical microphone array (mh acoustics em32 Eigenmike) and a B-format soundfield microphone array (Rode NT-SF1). The studio incorporates variable wall panels in felt and wood options, with measurements conducted for two configurations: all panels set to wood or felt. Three source positions that emulate typical performance locations were considered. RIRs were collected over a planar receiver grid spanning the room, with the microphone array centered at a height of 1.7 m. The paper includes an analysis of acoustic parameters derived from the dataset, revealing notable distinctions between felt and wood panel environments. Felt panels exhibit faster decay, higher clarity, and superior definition in mid-to-high frequencies. The analysis across the receiver grid emphasizes the impact of room geometry and source–receiver positions on reverberation time and clarity. The study also notes spatial variations in parameters obtained from the two microphone arrays, suggesting potential for future research into their specific capabilities for room acoustic characterization.
... The reverberation time T60 is defined as the time required for the sound energy to decay by 60 dB after the sound source is switched off (Kuttruff, 2019), and methods for obtaining T60 from a room impulse response (RIR) are traditionally well established (Karjalainen et al., 2002). However, because these methods cannot be applied in situations where the RIR is difficult to obtain, blind T60 estimation methods that estimate T60 solely from recorded speech signals have been proposed (Bryan, 2020; Deng et al., 2020; Eaton & Naylor, 2015a; Eaton et al., 2013, 2016; Gamper & Tashev, 2018; Löllmann et al., 2015; Prego et al., 2015; Xiong et al., 2018; Zheng et al., 2022). ...
... 1. peak finding is performed on the transfer function (given by Fourier transform of the impulse response) to identify candidate modal frequencies, 2. the modal frequencies of a rigid walled version of the impedance tube (idealized as a cylindrical duct) are computed analytically, and used to refine the set of candidate modal frequencies, 3. for each candidate modal frequency, f, a related damping coefficient, σ, is computed, using the short-time Fourier transform and the approach presented by Karjalainen et al. [25], 4. the estimated modal frequencies and damping coefficients are combined to give a set of initial estimates for the eigenvalues, λ0 = i(2πf + iσ), and 5. the matrix pencil is constructed, and the initial guesses are used to perform Rayleigh quotient iteration to find a set of candidate eigenvalues, λ, following Ref. [18]. ...
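Step 4 of the excerpt above is a one-line formula, and a small sketch may make its sign convention concrete. The (f, σ) pairs below are hypothetical values chosen for illustration, and `initial_eigenvalue` is an assumed helper name; only the expression i(2πf + iσ) comes from the excerpt.

```python
import math

def initial_eigenvalue(freq_hz, sigma):
    """Combine a modal frequency f (Hz) and damping coefficient sigma into
    lambda_0 = i(2*pi*f + i*sigma), which equals -sigma + i*2*pi*f."""
    return 1j * (2.0 * math.pi * freq_hz + 1j * sigma)

# Hypothetical candidate modes: (frequency in Hz, damping coefficient).
candidates = [(34.3, 1.2), (68.6, 2.5)]
guesses = [initial_eigenvalue(f, s) for f, s in candidates]

for (f, s), lam in zip(candidates, guesses):
    print(f"f = {f} Hz, sigma = {s}: lambda_0 = {lam.real:.3f} + {lam.imag:.3f}i")
```

Written this way, the damping appears as the negative real part of the initial guess and the angular frequency as the imaginary part.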
Conference Paper
For computational room acoustics, accurate surface impedance data is needed to generate computational models that aim to provide accurate predictions. However, obtaining complex-valued frequency-dependent impedances of an acoustically absorbing material at low frequencies is a challenging task. In point of fact, the current measurement standard ISO 354:2003, which describes the measurement of absorption coefficients (which can be related to impedance) in a reverberation chamber, states that it is difficult to obtain reliable data below 100 Hz. There is therefore a need for advanced low-frequency measurement techniques. This paper presents a validation of a recently proposed eigenvalue-based inverse method for estimating locally reacting impedances at modal frequencies. The proposed method is validated using data measured in an impedance tube. This method can be used to estimate the sample impedance in reverberant rooms at low frequencies.
... This work is typically done in the context of sound synthesis, but is equally valid for the proposed sensor equalization application. The mode parameters can be fit using traditional mode fitting techniques such as the Complex Exponential or Peak Picking methods [8,9,10]. The modal fits can be improved using a constrained optimization algorithm to reduce the error between the experimental and reconstructed frequency response functions [5,11]. ...
Conference Paper
This paper proposes a method to filter the output of instrument contact sensors to approximate the response of a well placed microphone. A modal approach is proposed in which mode frequencies and damping ratios are fit to the frequency response of the contact sensor, and the mode gains are then determined for both the contact sensor and the microphone. The mode frequencies and damping ratios are presumed to be associated with the resonances of the instrument. Accordingly, the corresponding contact sensor and microphone mode gains will account for the instrument radiation. The ratios between the contact sensor and microphone gains are then used to create a parallel bank of second-order biquad filters to filter the contact sensor signal to estimate the microphone signal.
... environments has been extensive, employing methods from signal processing [13,14,15] to deep learning [7,8,9,16,17]. Previously, room acoustic parameters, such as T60 and direct-to-reverb ratio (DRR), were more commonly estimated, but recent advances in neural networks allowed direct RIR estimation in the time domain [8,9]. ...
Preprint
In real-world acoustic scenarios, there often are multiple sound sources present in a room. These sources are situated in various locations and produce sounds that reach the listener from multiple directions. The presence of multiple sources in a room creates new challenges in estimating the room impulse response (RIR) as each source has a unique RIR, dependent on its location and orientation. Therefore, issues of determining which RIR should be predicted and how to predict it arise, when the input signal is a mixture of multiple reverberated sources. To address these, we propose a new task of predicting a "representative" RIR for a room in a multiple source environment and present a training method to achieve this goal. In contrast to the model trained in a single source environment, our method shows robust performance, regardless of the number of sources in the environment.
... Echo path changes were incorporated by instructing the users to move their device around or bring themselves to move around the device. The RT60 distribution for 4387 desktop environments in the real dataset for which impulse response measurements were available is estimated using a method by Karjalainen et al. [14] and shown in Figure 1. For 1251 mobile environments the RT60 distribution shown was estimated blindly from speech recordings [15]. ...
Article
The ICASSP 2023 Acoustic Echo Cancellation Challenge is intended to stimulate research in acoustic echo cancellation (AEC), which is an important area of speech enhancement and is still a top issue in audio communication. This is the fourth AEC challenge and it is enhanced by adding a second track for personalized acoustic echo cancellation, reducing the algorithmic latency to 20ms, and including a full-band version of AECMOS. We open source two large datasets to train AEC models under both single talk and double talk scenarios. These datasets consist of recordings from more than 10,000 real audio devices and human speakers in real environments, as well as a synthetic dataset. We open source an online subjective test framework and provide an online objective metric service for researchers to quickly test their results. The winners of this challenge were selected based on the average Mean Opinion Score (MOS) achieved across all scenarios and the word accuracy rate.
... Several algorithms have been proposed to blindly estimate an RIR from a reverberant source signal using traditional signal-processing approaches [11-13, 17, 18]. For some methods that take an ℓ 1 -norm-based approach, performance depends significantly on the choice of a regularization parameter corresponding to a real-world scenario [13]; some require multichannel speech signals [13]; and most assume that either the source signal is a modulated Gaussian pulse [17,18], or that the speaker and microphone characteristics are known [11,12]. For far-field ASR tasks, however, we are required to estimate RIRs from reverberant speech source signals independent of speaker and microphone characteristics. ...
Preprint
We propose to characterize and improve the performance of blind room impulse response (RIR) estimation systems in the context of a downstream application scenario, far-field automatic speech recognition (ASR). We first draw the connection between improved RIR estimation and improved ASR performance, as a means of evaluating neural RIR estimators. We then propose a GAN-based architecture that encodes RIR features from reverberant speech and constructs an RIR from the encoded features, and uses a novel energy decay relief loss to optimize for capturing energy-based properties of the input reverberant speech. We show that our model outperforms the state-of-the-art baselines on acoustic benchmarks (by 72% on the energy decay relief and 22% on an early-reflection energy metric), as well as in an ASR evaluation task (by 6.9% in word error rate).
... Various algorithms exist for estimating the model parameters T i , A i , and A noise from noisy RIR measurements [22,32,34]. In the denoising approach, a recently proposed neural network architecture is used, which was shown to be robust, computationally efficient, and deterministic (as opposed to iterative), [22]. ...
Article
Spatial room impulse responses (SRIRs) capture room acoustics with directional information. SRIRs measured in coupled rooms and spaces with non-uniform absorption distribution may exhibit anisotropic reverberation decays and multiple decay slopes. However, noisy measurements with low signal-to-noise ratios pose issues in analysis and reproduction in practice. This paper presents a method for resynthesis of the late decay of anisotropic SRIRs, effectively removing noise from SRIR measurements. The method accounts for both multi-slope decays and directional reverberation. A spherical filter bank extracts directionally constrained signals from Ambisonic input, which are then analyzed and parameterized in terms of multiple exponential decays and a noise floor. The noisy late reverberation is then resynthesized from the estimated parameters using modal synthesis, and the restored SRIR is reconstructed as Ambisonic signals. The method is evaluated both numerically and perceptually, which shows that SRIRs can be denoised with minimal error as long as parts of the decay slope are above the noise level, with signal-to-noise ratios as low as 40 dB in the presented experiment. The method can be used to increase the perceived spatial audio quality of noise-impaired SRIRs.
... Since rooms have thousands of modes, it is common-practice to estimate them on a band-by-band basis [2,3]. Mode frequencies of carillon bells are estimated similarly in [4], and the decay rates are found with non-linear optimization [5]. Frequency-zoomed ARMA modeling on filtered groups of resonant frequencies has been used to model noisy string instruments and room responses in [6]. ...
Preprint
Linear systems such as room acoustics and string oscillations may be modeled as the sum of mode responses, each characterized by a frequency, damping and amplitude. Here, we consider finding the mode parameters from impulse response measurements, and estimate the mode frequencies and decay rates as the generalized eigenvalues of Hankel matrices of system response samples, similar to ESPRIT. For greater resolution at low frequencies, such as desired in room acoustics and musical instrument modeling, the estimation is done on a warped frequency axis. The approach has the benefit of selecting the number of modes to achieve a desired fidelity to the measured impulse response. An optimization to further refine the frequency and damping parameters is presented. The method is used to model coupled piano strings and room impulse responses, with its performance comparing favorably to FZ-ARMA.
... The near end single talk speech quality is given in Figure 1. The RT60 distribution for 2678 environments in the real dataset for which impulse response measurements were available is estimated using a method by Karjalainen et al. [13] and shown in Figure 2. The RT60 estimates can be used to sample the dataset for training. ...
Preprint
The ICASSP 2022 Acoustic Echo Cancellation Challenge is intended to stimulate research in acoustic echo cancellation (AEC), which is an important area of speech enhancement and still a top issue in audio communication. This is the third AEC challenge and it is enhanced by including mobile scenarios, adding speech recognition rate in the challenge goal metrics, and making the default sample rate 48 kHz. In this challenge, we open source two large datasets to train AEC models under both single talk and double talk scenarios. These datasets consist of recordings from more than 7,500 real audio devices and human speakers in real environments, as well as a synthetic dataset. We also open source an online subjective test framework and provide an online objective metric service for researchers to quickly test their results. The winners of this challenge are selected based on the average Mean Opinion Score achieved across all different single talk and double talk scenarios, and the speech recognition word acceptance rate.
... Since the EDC and EDR formulas integrate the noise energy contained in an IR, further considerations are necessary to interpret the data beyond the noise floor. In [36], different strategies to minimize the impact of noise on EDC analysis are explored. The decay rates in the beginning of the IR for different frequencies [37] and directions [38] may also be used to replace the noisy part of SIR with decaying noise, which is beneficial in sound reproduction using captured IRs. ...
Thesis
Available online with the related articles at: http://urn.fi/URN:ISBN:978-952-64-0472-1 In this dissertation, the reproduction of reverberant sound fields containing directional characteristics is investigated. A complete framework for the objective and subjective analysis of directional reverberation is introduced, along with reverberation methods capable of producing frequency- and direction-dependent decay properties. Novel uses of velvet noise are also proposed for the decorrelation of audio signals as well as artificial reverberation. The methods detailed in this dissertation offer the means for the auralization of reverberant sound fields in real-time, with applications in the context of immersive sound reproduction such as virtual and augmented reality.
... The near end single talk speech quality is given in Figure 1. The RT60 distribution for 2678 environments in the real dataset for which impulse response measurements were available is estimated using a method by Karjalainen et al. [12] and shown in Figure 2. The RT60 estimates can be used to sample the dataset for training. ...
... where T60 is the reverberation time (defined as the period of time required for the sound pressure to decay 60 dB, which in this work is estimated by Karjalainen's algorithm [7]), σ_r² is the room spectral variance (RSV) [8], R is the direct-to-reverberant energy ratio (DRR) [9], and γ = 0.3. In practice, a higher T60 indicates a more lasting reverberation effect, the RSV is closely related to the coloration effect, and the DRR provides some insight on the source-microphone relative position. ...
... Consequently, a correction term was added to the integration to prevent the truncation error [9,13,14]. For mathematical advantages, nonlinear regression methods [15] were investigated to fit the RIR to calculate the slope of the EDC. The technique was further developed as an automated detection method for calculating the correction term and determining the truncation time [5,8]. ...
Article
The generalized spectral subtraction algorithm (GBSS), which has extraordinary ability in background noise reduction, is historically one of the first approaches used for speech enhancement and dereverberation. However, the algorithm has not been applied to de-noise the room impulse response (RIR) to extend the reverberation decay range. The application of the GBSS algorithm in this study is stated as an optimization problem, that is, subtracting the noise level from the RIR while maintaining the signal quality. The optimization process conducted in the measurements of the RIRs with artificial noise and natural ambient noise aims to determine the optimal sets of factors to achieve the best noise reduction results regarding the largest dynamic range improvement. The optimal factors are set variables determined by the estimated SNRs of the RIRs filtered in the octave band. The acoustic parameters, the reverberation time (RT), and early decay time (EDT), and the dynamic range improvement of the energy decay curve were used as control measures and evaluation criteria to ensure the reliability of the algorithm. The de-noising results were compared with noise compensation methods. With the achieved optimal factors, the GBSS contributes to a significant effect in terms of dynamic range improvement and decreases the estimation errors in the RTs caused by noise levels.
... Echo path change was incorporated by instructing the users to move their device around or bring themselves to move around the device. The near end single talk speech quality is given in Figure 2. The RT60 distribution for 2678 environments in the real dataset for which impulse response measurements were available is estimated using a method by Karjalainen et al. [12] and shown in Figure 3. The RT60 estimates can be used to sample the dataset for training. ...
Article
The INTERSPEECH 2021 Acoustic Echo Cancellation Challenge is intended to stimulate research in the area of acoustic echo cancellation (AEC), which is an important part of speech enhancement and still a top issue in audio communication. Many recent AEC studies report good performance on synthetic datasets where the training and testing data may come from the same underlying distribution. However, AEC performance often degrades significantly on real recordings. Also, most of the conventional objective metrics such as echo return loss enhancement and perceptual evaluation of speech quality do not correlate well with subjective speech quality tests in the presence of background noise and reverberation found in realistic environments. In this challenge, we open source two large datasets to train AEC models under both single talk and double talk scenarios. These datasets consist of recordings from more than 5,000 real audio devices and human speakers in real environments, as well as a synthetic dataset. We also open source an online subjective test framework and provide an online objective metric service for researchers to quickly test their results. The winners of this challenge are selected based on the average Mean Opinion Score achieved across all different single talk and double talk scenarios.
Article
Accurate boundary condition descriptions are essential for creating accurate computational models for room acoustics. These descriptions are often defined in terms of absorption coefficients or surface impedances. However, obtaining complex-valued, frequency-dependent values for an acoustically absorbing material sample at low frequencies is a challenging task. In fact, the current measurement standard ISO 354:2003 does not provide a method for measuring absorption coefficients below 100 Hz. In this paper, a recently proposed eigenvalue-based inverse method for estimating locally-reacting impedance at measurement system resonance frequencies is validated. The validation uses data from three samples measured in an impedance tube. The method is also used to estimate complex-valued, frequency-dependent impedances at frequencies below 100 Hz in a measured reverberant room and in a simulated reverberation chamber.
Article
An established model for sound energy decay functions (EDFs) is the superposition of multiple exponentials and a noise term. This work proposes a neural-network-based approach for estimating the model parameters from EDFs. The network is trained on synthetic EDFs and evaluated on two large datasets of over 20 000 EDF measurements conducted in various acoustic environments. The evaluation shows that the proposed neural network architecture robustly estimates the model parameters from large datasets of measured EDFs while being lightweight and computationally efficient. An implementation of the proposed neural network is publicly available.
Article
Recently, deep learning-based methods for blind reverberation time estimation have been proposed, and outperform those based on conventional signal processing. The signal processing approaches extract the reverberant environmental features of sound by statistical analysis, while deep learning approaches train a network to capture the relationship between acoustic features and the reverberation time. In this letter, we propose a method for blind reverberation time estimation that explicitly reflects physical properties of reverberation by combining the deep learning approach, attentive pooling, and statistical characteristics of reverberant speech obtained using a signal processing method, i.e., spectral decay rates (SDRs). The results obtained with the proposed blind reverberation time estimation method are superior to the previously published state-of-the-art results for the EVAL dataset of the ACE Challenge. This work can be considered a good example of the collaboration between signal processing expertise and deep learning approach.
Article
The knowledge of frequency-dependent spatiotemporal features of the reflected soundfield is essential in optimizing the perception quality of spatial audio applications. For this purpose, we need a reliable room acoustic analyzer that can conceive the spatial variations in a decaying reflected soundfield according to the frequency-dependent surface properties and source directivity. This paper introduces a time-frequency-dependent angular reflection power distribution model represented by a von Mises-Fisher (vMF) mixture function to facilitate manifold analysis of a reverberant soundfield. The proposed approach utilizes the spatial correlation of higher-order eigenbeams to deduce the directional reflection power vectors, which are then synthesized into a vMF mixture model. The experimental study demonstrates the directional power variations of early reflections and late reverberations across different frequencies. This work also introduces a measure called the directivity time-span to quantify the duration of anisotropic reflections before it decays into a totally diffused field. We validate the subband performance by comparing it with the eigenbeam multiple signal classification method. The results prove the influence of source position, source directivity, and room environment in the distribution of reflection power, whereas the directivity time-span behaves independent of the source positions.
Article
The reverberation time is one of the most important parameters used to characterize the acoustic property of an enclosure. In real-world scenarios, it is much more convenient to estimate the reverberation time blindly from recorded speech compared to the traditional acoustic measurement techniques using professional measurement instruments. However, the recorded speech is often corrupted by noise, which has a detrimental effect on the estimation accuracy of the reverberation time. To address this issue, this paper proposes a two-stage blind reverberation time estimation method based on noise-aware time-frequency masking. This proposed method has a good ability to distinguish the reverberation tails from the noise, thus improving the estimation accuracy of reverberation time in noisy scenarios. The simulated and real-world acoustic experimental results show that the proposed method significantly outperforms other methods in challenging scenarios.
Article
Early-to-late energy ratios are used to predict speech intelligibility. The computation is based on the measured impulse response for a source-microphone combination. When measurement is corrupted by extraneous noise, the upper limit of integration cannot be the total time of acquisition, but a new time limit value defined as the useful length of the impulse response. This limit is validated with the reverberation time computed from the Schroeder backward integral. Results are shown for three highly reverberant rooms.
Article
For the measurement of reverberation time, a method of directly regressing the envelope of a squared impulse response (direct method) was examined using 100 data obtained in 14 auditoria. The results were compared with those obtained by the integrated impulse response method (Schroeder method), and very close agreement was found between them. It was also found that the direct method is more robust to background noise than the Schroeder method, and that the direct method can work with relatively short data compared with the Schroeder method.
Article
Discrete-time analysis and modeling of reverberant and resonating systems has many applications in audio and acoustics. The methodology of ARMA modeling by pole–zero filters for measured impulse responses was investigated. In addition to an overview of the standard AR and ARMA techniques, a spectral zooming technique is proposed, which is useful for resolving very closely positioned modes and high-density modal clusters. Application cases related to the analysis and modeling of room responses, loudspeaker–room equalization, and the estimation of parameters for musical instrument modeling are studied.
Article
A method of measuring linear-system responses (such as room responses) is presented using "maximum-length" pseudorandom noise as the test signal. In this manner, high signal-to-noise ratios can be achieved, even for measurements in noisy environments and for low-power test signals. Pseudorandom noise has also been used successfully as the test signal in the "integrated-impulse" method of measuring sound decay and reverberation time. This eliminates the need to radiate a short pulse of high peak energy for impulse type measurements. Improvements in signal-to-noise ratios are equal to the period length of the pseudorandom noise, typically 40 dB in room acoustical applications. The necessary digital processing to realize these gains in signal-to-noise ratio and accuracy of response can be performed on available minicomputers.
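The property that makes maximum-length sequences useful as a test signal is their nearly impulsive circular autocorrelation, which lets an impulse response be recovered by cross-correlation. The sketch below is a toy illustration, not the procedure of the abstract: it assumes an order-4 sequence generated by the recurrence s[n] = s[n-3] XOR s[n-4] (polynomial x^4 + x + 1), and the function names are hypothetical. Practical measurements use much longer sequences.

```python
def mls_order4():
    """Generate one period (15 samples) of an order-4 maximum-length sequence,
    mapped from bits {0, 1} to levels {-1, +1}."""
    bits = [1, 1, 1, 1]  # any non-zero start state works
    for n in range(4, 15):
        bits.append(bits[n - 3] ^ bits[n - 4])
    return [1 if b else -1 for b in bits]

def circular_autocorrelation(x):
    """Unnormalized circular autocorrelation for all lags 0..len(x)-1."""
    n = len(x)
    return [sum(x[i] * x[(i + lag) % n] for i in range(n)) for lag in range(n)]

seq = mls_order4()
acf = circular_autocorrelation(seq)
print(seq)
print(acf)  # peak equal to the period at lag 0, a constant small value elsewhere
```

The peak at lag zero grows with the period length while the off-peak values stay constant, which is consistent with the abstract's statement that the signal-to-noise improvement scales with the period of the sequence.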
Article
An efficient approach for real-time synthesis of plucked string instruments using physical modeling and DSP techniques is presented. Results of model-based resynthesis are illustrated to demonstrate that high-quality synthetic sounds of several string instruments can be generated using the proposed modeling principles. Real-time implementation using a signal processor is described, and several aspects of controlling physical models of plucked string instruments are studied.
Article
Backward integration of a room impulse response has long been used in room acoustics for the estimation of reverberation time. However, the inevitable noise floor limits the minimum level of the measured impulse response, thereby leading to errors. This paper shows how the error is influenced by the selection of truncation time and evaluation range. A general guideline that emerges from this study is to truncate the measured impulse response at the knee where the main decay slope intersects the noise floor, then measure the slope of the backward integrated truncated impulse response down to a level about 5 dB above the noise floor. (C) 1997 Acoustical Society of America.
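The effect of truncating at the knee can be illustrated on synthetic data. The sketch below rests on several assumptions that are not part of the abstract: the squared response is modeled deterministically as an exponential energy decay plus a constant noise power 40 dB below the initial level, the knee is taken from the known model rather than estimated, and the slope is fitted between -5 dB and -25 dB instead of down to 5 dB above the noise floor. The names `edc_db` and `decay_time` are hypothetical.

```python
import math

def edc_db(energy):
    """Backward-integrate an energy sequence; return the decay curve in dB re. its start."""
    edc, running = [], 0.0
    for e in reversed(energy):
        running += e
        edc.append(running)
    edc.reverse()
    return [10.0 * math.log10(v / edc[0]) for v in edc]

def decay_time(energy, fs, upper_db=-5.0, lower_db=-25.0):
    """Least-squares slope of the decay curve in the given range, extrapolated to 60 dB."""
    curve = edc_db(energy)
    pts = [(i / fs, lv) for i, lv in enumerate(curve) if lower_db <= lv <= upper_db]
    n = len(pts)
    mt = sum(t for t, _ in pts) / n
    ml = sum(lv for _, lv in pts) / n
    slope = sum((t - mt) * (lv - ml) for t, lv in pts) / sum((t - mt) ** 2 for t, _ in pts)
    return -60.0 / slope

fs = 1000
rt_true = 1.0
noise_power = 1e-4                      # assumed noise floor, -40 dB re. initial level
k = 6.0 * math.log(10.0) / rt_true      # energy decay rate giving 60 dB in rt_true
energy = [math.exp(-k * n / fs) + noise_power for n in range(2 * fs)]

# Knee: first sample where the decaying part drops to the noise floor (known here by construction).
knee = next(n for n in range(len(energy)) if math.exp(-k * n / fs) <= noise_power)

rt_full = decay_time(energy, fs)          # integrate over the whole record, noise included
rt_trunc = decay_time(energy[:knee], fs)  # integrate only up to the knee
print(f"full record: {rt_full:.3f} s, truncated at knee: {rt_trunc:.3f} s, true: {rt_true} s")
```

In this toy case the full-record estimate is biased upward by the integrated noise, while the truncated estimate stays close to the true value. With measured data the knee itself has to be estimated, which is where most of the practical difficulty lies.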
Article
The admittance of the piano bridge has a crucial effect on piano tone by coupling together the strings belonging to one note into a single dynamical system. In this paper, we first develop theoretical expressions that show how the rate of energy transmission to the bridge as a function of time (including the phenomena of beats and "aftersound") depends on bridge admittance, hammer irregularities, and the exact state in which the piano is tuned. We then present experimental data showing the effects of mutual string coupling on beats and aftersound, as well as the great importance of the two polarizations of the string motion. The function of the una corda pedal in controlling the aftersound is explained, and the stylistic possibilities of a split damper are pointed out. The way in which an excellent tuner can use fine tuning of the unisons to make the aftersound more uniform is discussed.
Article
The influence of several sources of error on room acoustical measurements has been investigated using maximum-length sequences (MLS). The algorithms for the determination of room acoustical parameters used by different analyzers introduce systematic differences caused by differences in time windowing and filtering, in reverse-time integration and in noise compensation. The overall uncertainty of measurements is of the same magnitude or a little higher than subjectively perceivable changes in room acoustical parameters when the measurements are performed according to ISO/DIS 3382. However, the draft standard allows various procedures to be applied in the processing of impulse responses.
Article
Reverberation decay curves can be obtained by the backward integration of room impulse responses proposed by M. R. Schroeder. The evaluation of reverberation times is often achieved by a linear regression line fitting the reverberation decay curves. However, under noisy conditions, the successful application of this method requires either a careful choice of the integration limit or a precise estimate of the mean-square value of the background noise. In the present paper, an alternative method using a nonlinear iterative regression approach for evaluating reverberation times from Schroeder's decay curves is proposed. A nonlinear model of the decay curves, established according to the nature of Schroeder's decay curves, is used for the regression process rather than the linear model used in linear regression. The regression process is based on the generalized least-squares error principle in which a rapid convergence can be observed. Preliminary experiments show a slight dependence of the reverberation time T30dB on both the integration limit and the background noise. Using this approach, a high precision of reverberation time evaluations can be achieved without either careful selection of the integration limit or precise estimation of the mean square value of the background noise.
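The idea of fitting a nonlinear model to the decay curve, rather than a straight line to its logarithm, can be sketched as follows. This is not the iterative regression of the abstract: the model form d(t) = A·exp(-k·t) + B·(L - t), with L the upper integration limit, is an assumption about how a backward-integrated decay with stationary noise may be written, and a simple grid search over the decay rate with a linear least-squares solve for A and B stands in for the iterative scheme. All names and parameter values are hypothetical.

```python
import math

def fit_decay_model(t, d, limit, k_grid):
    """For each candidate decay rate k, solve the 2x2 normal equations for A and B in
    d(t) ~ A*exp(-k*t) + B*(limit - t), and keep the k with the smallest squared error."""
    g2 = [limit - ti for ti in t]
    s22 = sum(v * v for v in g2)
    s2d = sum(v * di for v, di in zip(g2, d))
    best = None
    for k in k_grid:
        g1 = [math.exp(-k * ti) for ti in t]
        s11 = sum(v * v for v in g1)
        s12 = sum(a * b for a, b in zip(g1, g2))
        s1d = sum(v * di for v, di in zip(g1, d))
        det = s11 * s22 - s12 * s12
        a = (s1d * s22 - s12 * s2d) / det
        b = (s11 * s2d - s12 * s1d) / det
        err = sum((a * x + b * y - di) ** 2 for x, y, di in zip(g1, g2, d))
        if best is None or err < best[0]:
            best = (err, k, a, b)
    return best[1], best[2], best[3]

# Synthetic decay curve from the assumed model (no measurement noise on the curve itself).
fs, limit = 100, 2.0
rt_true = 0.8
k_true = 6.0 * math.log(10.0) / rt_true
t = [n / fs for n in range(int(limit * fs))]
d = [1.0 * math.exp(-k_true * ti) + 1e-3 * (limit - ti) for ti in t]

# Candidate reverberation times from 0.2 s to 2.0 s in 10 ms steps, converted to decay rates.
rt_grid = [0.2 + 0.01 * i for i in range(181)]
k_grid = [6.0 * math.log(10.0) / rt for rt in rt_grid]

k_est, a_est, b_est = fit_decay_model(t, d, limit, k_grid)
rt_est = 6.0 * math.log(10.0) / k_est
print(f"estimated reverberation time: {rt_est:.2f} s (true {rt_true} s)")
```

Because the noise term is part of the model, the fit does not depend on choosing an integration limit that avoids the noise, which is the advantage the abstract describes. A grid search is used here only for transparency; an iterative solver would refine k continuously.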
Article
A new method of measuring reverberation time is described. The method uses tone bursts (or filtered pistol shots) to excite the enclosure. A simple integral over the tone-burst response of the enclosure yields, in a single measurement, the ensemble average of the decay curves that would be obtained with bandpass-filtered noise as an excitation signal. The smooth decay curves resulting from the new method improve the accuracy of reverberation-time measurements and facilitate the detection of nonexponential decays.
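The integral over the squared response described in this abstract can be illustrated with a short sketch. It is a minimal example under assumptions rather than the author's implementation: the function names `schroeder_decay_db` and `reverberation_time`, the -5 dB to -35 dB fitting range, and the synthetic exponentially decaying noise that stands in for a measured tone-burst response are all chosen for illustration.

```python
import math
import random
from itertools import accumulate

def schroeder_decay_db(h):
    """Backward-integrated energy decay curve in dB, normalised to 0 dB at t = 0."""
    energy = [x * x for x in h]
    # Cumulative sum taken from the end of the response towards its start.
    tail = list(accumulate(reversed(energy)))[::-1]
    return [10.0 * math.log10(e / tail[0]) for e in tail]

def reverberation_time(edc_db, fs, upper_db=-5.0, lower_db=-35.0):
    """Fit a straight line between two decay levels; return the 60 dB decay time."""
    pts = [(n / fs, level) for n, level in enumerate(edc_db)
           if lower_db <= level <= upper_db]
    t_mean = sum(t for t, _ in pts) / len(pts)
    l_mean = sum(level for _, level in pts) / len(pts)
    slope = (sum((t - t_mean) * (level - l_mean) for t, level in pts)
             / sum((t - t_mean) ** 2 for t, _ in pts))
    return -60.0 / slope

# Synthetic noise-free response (an assumption): Gaussian noise whose
# amplitude envelope falls by 60 dB in rt_true seconds.
fs, rt_true = 8000, 0.5
rng = random.Random(0)
h = [rng.gauss(0.0, 1.0) * 10.0 ** (-3.0 * (n / fs) / rt_true)
     for n in range(fs)]

edc_db = schroeder_decay_db(h)
rt_est = reverberation_time(edc_db, fs)
print(f"estimated reverberation time: {rt_est:.3f} s (true value {rt_true} s)")
```

Because each point of the curve sums all of the remaining energy, the result is much smoother than the squared response itself, which is what makes the straight-line fit stable for a single measurement.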
Article
Utilizing a digital acquisition system and minicomputer, two promising techniques for accurate determination of reverberation times have been studied in detail from the viewpoint of standard reverberation room tests. The first is Schroeder's "integrated impulse method," and special attention was given to the question of repeatability and the influence of signal-to-noise ratio on the successful application of the method. The second technique involves taking an ensemble average of a large number of logarithmic decay curves. It was found that, even for nonuniform decays, the average decay curves obtained by the second method compared well with those determined by the Integrated Impulse Method.
ISO 3382-1997, "Acoustics--Measurement of the Reverberation Time of Rooms with Reference to Other Acoustical Parameters," International Standards Organization, Geneva, Switzerland (1997), 21 pp.
W. C. Sabine, Architectural Acoustics (1900; reprinted by Dover, New York, 1964).
L. Faiget, R. Ruiz, and C. Legros, "Estimation of Impulse Response Length to Compute Room Acoustical Criteria," Acustica, vol. 82, suppl. 1, p. S148 (1996 Sept.).