PAPERS
Digital Equalization of Room Acoustics*
JOHN N. MOURJOPOULOS, AES Member
Wire Communications Laboratory, Electrical Engineering Department, University of Patras, 26110 Patras, Greece
Signal-processing methods such as digital equalization can in theory achieve a reduction
in acoustic reverberation. In practice, however, the realization of these methods is only
partially successful for a number of objective and subjective (perceptual) reasons. Two of
these problems, the dependence of the equalizer performance on the source and receiver
positions and the requirement for extremely lengthy filters, are addressed. It is proposed that
all-pole modeling of room responses can relax the equalizer filter length requirement, and
the use of vector quantization can optimally classify such responses, obtained at different
source and receiver positions. Such classification can be used as a spatial equalization library,
achieving reduction in reverberation over a wide range of positions within an enclosure, as
was confirmed by a number of tests.
* Presented at the 92nd Convention of the Audio Engineering Society, Vienna, Austria, 1992 March 24-27; revised 1993 July 1, 1994 June 5 and August 3. J. Audio Eng. Soc., Vol. 42, No. 11, 1994 November.

0 INTRODUCTION

The use of digital equalization in room acoustics has been attracting considerable interest mainly because commercial products have appeared [1], or will appear shortly, "correcting" the effects of reverberation on audio signals. This development is mainly a result of the technological evolution in digital hardware since the principles of digital equalization were established at least a decade ago. It is also significant that some problems related to the application of such methods still remain largely unsolved.

First this paper will present the principles of digital equalization of the acoustics of closed spaces, and then it will discuss the problems related to the application of such methods, proposing an approach to their resolution.

Digital equalization can be used in different applications in acoustics, such as 1) audio system and room response correction [1]-[4], 2) active noise cancellation [5], 3) dereverberation of recorded signals [6], and 4) videoconferencing and computer speech recognition [7], [8].

The theoretical basis of digital equalization is the principle of deconvolution. In applications 1) and 2) the audio signal is prefiltered by the equalizer in order to compensate for its subsequent transmission through the audio and acoustics system (amplifier, loudspeakers, and room), whereas in applications 3) and 4) the audio signal is postfiltered by the equalizer to remove distortions generated by such transmission (Fig. 1). Furthermore, due to the varying requirements in these applications, the theoretical constraints dictated by the use of deconvolution can be relaxed. For example, in applications 1) it is often sufficient to achieve an approximate correction of the transmission path distortions, possibly by anticipating the gross frequency domain distortions [1], [9]. In the case of applications 4) the effect of reverberation is often anticipated by the modification of certain speech parameters instead of directly processing the acoustic signal. Nevertheless, in all cases certain yet undefined perceptual rather than mathematical criteria must be employed before such methods can be used in a totally robust and efficient manner.

1 PRINCIPLES OF RTF EQUALIZATION

The analysis of sound transmission in reverberant enclosures by means of computers, based on system rather than physical models, can be traced back to Schroeder and Kuttruff [10]. In principle the combined effect of direct and multipath sound transmission in such spaces is described by a source-to-receiver impulse response function $h(t)$, where $t$ is the time index in seconds [11]. Ignoring the effects of source or receiver directionality and minor perturbations due to environmental influences, this factor is uniquely defined for each enclosure, but for each specific set of source and receiver coordinates, and is described by a vector $\mathbf{l}$ having as parameters these six coordinates. The work presented here will be concerned with the discrete-time representation of this
continuous-time impulse response function $h(nT)$, where $n$ is the discrete-time (sample) index and $T$ the sampling interval in seconds, which, for simplicity, will be taken as 1. In the frequency domain these effects are described by the complex room transfer function (RTF) $H(z)$, expressed as a $z$ transform of the impulse response. In effect, sound transmission in closed spaces is represented by a linear, time-invariant filter with response $H(z)$.

For perfect equalization the distortions imposed by this filter have to be removed. This can be achieved by introducing an "inverse filter" (that is, the equalizer), having response $h_i(n)$ and transfer function $H_i(z)$ (Fig. 1), such that

$$h(n) * h_i(n) = \delta(n), \qquad H(z) \cdot H_i(z) = 1 \tag{1}$$

where $\delta(n)$ is the unit pulse function and $*$ denotes discrete-time linear convolution.

Fig. 1. Block diagram for typical digital equalizer applications. (a) Prefiltering arrangement (source, digital equalizer, acoustic system, receiver). (b) Postfiltering arrangement (source, acoustic system, digital equalizer, receiver).

The first comprehensive attempt to design such an equalizer for room acoustics was due to Neely and Allen [2]. These authors proposed a minimum-phase equalizer, that is, one that corrects only minimum-phase (modulus of RTF) distortions, according to the decomposition

$$H(z) = H_{mp}(z) \cdot H_{ap}(z) \tag{2}$$

where $H_{mp}(z)$ is the minimum phase (or, more correctly, the equivalent minimum phase [12]) of the RTF, and $H_{ap}(z)$ is its all-pass component. It is by definition [2], [13]

$$|H(z)| = |H_{mp}(z)|, \qquad |H_{ap}(z)| = 1, \quad \text{for } z = e^{j\omega} \text{ and all } \omega. \tag{3}$$

The minimum-phase equalizer derived after this RTF decomposition will have a transfer function $H_{imp}(z)$ such that

$$|H_{imp}(z)| = |H(z)|^{-1}. \tag{4}$$

From Eqs. (1)-(4) it is clear that such an equalizer will not achieve perfect time-domain deconvolution [according to Eq. (1)], but instead it will leave a residual (error) equal to the RTF all-pass component which has a "flat" spectrum but also carries most of the time-domain reverberant energy. Fig. 2 shows a typical room impulse response and its all-pass component which, as can be observed, has significant reverberant energy. Practically speaking, the effect of the RTF all-pass component is perceived as reverberation and "smearing" of transients in the signal, but without the coloration often associated with such distortions. Subsequent work [14]-[16] has discussed the effect and nature of the all-pass and minimum-phase components of audio systems as well as RTFs. Given that the design of such minimum-phase equalizers is simpler than that of the "complete" equalizer [as defined by Eq. (1)], such an approach is often adopted in many of the applications listed.

Fig. 2. (a) Impulse response function measurement of rectangular room. (b) Its all-pass component. Room dimensions 8 by 4 by 3 m; central source position; source-to-receiver distance 0.5 m; sampling frequency 15 kHz. (Horizontal axes: time in seconds.)

In principle the direct inversion of mixed-phase (or
nonminimum-phase) responses, such as the RTFs, is not possible since it leads to unstable filter realizations [17]. However, it is possible to derive and implement an approximate and stable inversion of such functions [17]-[20]. This is so because a causal and stable sequence can invert the minimum-phase component of any mixed-phase function, and an infinite acausal (anticipatory) and stable sequence can similarly invert the maximum-phase component of such functions. Assuming that these infinite inverses can be truncated after some term and thus rendered finite, it is also theoretically possible to shift the maximum-phase inverse sequence in time so that its end coincides with the time origin, that is, being at $t = 0$. This shifting operation will produce a causal approximation of this function, and it can be practically realized by the introduction of a delay operator which produces the inverse sequence at a relative (to the input) interval. Clearly, this delay should be at least equal to the length of the truncated acausal maximum-phase inverse. Such an approximation will practically approximate the "perfect" deconvolution [as described by Eq. (1)], provided that the acausal sequence decays to low-amplitude terms approaching zero (or the measurement or arithmetic system noise floor) for the chosen delay value.

In the Appendix a mathematical presentation of this analysis is given, augmented by simple examples which may help the reader to clarify these aspects of the inversion of mixed-phase functions.

Any mixed-phase sequence, such as a room impulse response, can be represented mathematically by the convolution of a minimum-phase and a maximum-phase term [13]. Similarly, the inverse function for such a sequence will be composed of the convolution of the corresponding minimum- and maximum-phase inverses. According to the preceding discussion, such an inverse sequence can be rendered causal and stable after the introduction of a delay factor which will move its acausal part into the causal (positive) time scale. Such an approach was adopted in [18], [19], where a homomorphic decomposition of the room impulse response into minimum- and maximum-phase terms was at first undertaken prior to the evaluation of an approximate and delayed inverse in the manner described previously. However, such an approach was found to be oversensitive to errors in the initial homomorphic decomposition of the room response [18], [21], and for this, the inversion of mixed-phase responses is usually implemented via least-squares methods [17], [18] [Fig. 3(b)]. Such an approach allows the estimation of the approximate finite-duration inverse, and in addition it is also possible to evaluate the optimum delay value by using the Simpson sideways recursion routine [18], [20]. Fig. 3(b) shows such an inverse. Apart from the introduction of the delay factor (which is obviously inconsequential in practical real- or nonreal-time implementations), such a method is approximate only in the sense that the maximum-phase inverse has been truncated and also finite approximations of the inverses are considered, something which in practice introduces insignificant computational error. In this sense it is incorrect to suggest [22] that such "complete" and approximate RTF inversion is not possible, due to the mixed-phase property of the room response functions.

Fig. 3. Complete equalization of full-bandwidth (sampling rate 44.1 kHz) room response for room of Fig. 2, but at 3-m source-receiver distance. (a) Impulse response. (b) Inverse filter ($n_s$ = 32,768 samples). Note mixed-phase structure of filter. (c) Time-domain equalization. (d) Original modulus RTF and equalized version. (For clearer presentation, curves were shifted on vertical axis.)

Preis [23], [24] has stated the principles for digital equalization in audio systems, and subsequent work by Berkovitz [3], [25] and others [26] described attempts to realize such equalizers, but without sufficient theoretical analysis of the problem or presenting any results. The
theory and practice of realizing such equalizers, appropriate for mixed-phase systems, were presented in [17], [18]. As was shown in these references, nearly perfect equalization of RTFs and other audio systems according to Eq. (1) can be achieved given that the previously described approximations and conditions are satisfied. However, the performance of such "complete" and approximate equalizers can be practically limited due to the problems to be discussed in the following section. Fig. 4 shows the complete time and frequency equalization of a room response achieved by this method. In this example the impulse response of a rectangular room of approximate dimensions of 8 by 3 by 4 m was measured for a central source and receiver location and a distance of 0.5 m by using a pseudorandom excitation technique. For this example the measurement sampling frequency was 15 kHz, resulting in an approximate bandwidth of 5 kHz. For clarity, equalized spectra of up to 3 kHz are shown (Figs. 2 and 4). However, this approach is by no means restricted by the bandwidth of the RTF, as is also obvious for the equalized 44.1-kHz sampling rate response of Fig. 3.

Fig. 4. Complete equalization of impulse response for room of Fig. 2. (a) Modulus RTF to 3 kHz. (b) Time-domain equalizer output. (c) Frequency-domain equalizer output. Equalizer filter length $n_s$ = 4096 samples.

Numerous other dereverberation techniques have also been proposed [22], [27]-[30], [41], mainly aimed at reverberant speech enhancement, that is, for the application areas 3) and 4) described in the Introduction.

2 [...] EQUALIZATION

There are mainly two practical problems related to the previously described equalization methods [4], [31]. The first problem is the extremely large inverse filter order [as can also be observed from Fig. 3(b)]. In practice $h(n)$ is a finite-length sequence, defined for $0 \le n \le n_{max}$, where $n_{max}$ is the maximum value of the time index, usually dictated by the dynamic range of the particular measurement. Practically this can be related to actual time (in seconds) via the well-known acoustic parameter of reverberation time RT of the particular enclosure, which by definition dictates a 60-dB dynamic range. Then the number of samples required to measure $h(n)$ and also to design $h_i(n)$ will be given by

$$n_s = RT \cdot f_s \quad \text{(samples)} \tag{5}$$

where $f_s$ is the sampling rate in hertz. Although in theory the length of the inverse filter for a particular RTF can be larger than the length of the responses, it has been found that by using the least-squares finite-length approximation, a filter order comparable to the original response is usually sufficient to produce satisfactory deconvolution. As an example, for RT = 1 s and $f_s$ = 10 kHz, $10 \times 10^3$ samples (coefficients) will be required for $h_i(n)$, whereas for $f_s$ = 44.1 kHz, $44.1 \times 10^3$ samples (coefficients) will be required. When shorter equalizer filters are realized, then the performance of the method deteriorates [4]. Fig. 5 shows the equalization error due to the use of an insufficient-length inverse filter.

For real-time implementation of such filters [31] a multiplication time of approximately 5.5 ns (or nearly 2 ns for stereo operation) will be required by the digital hardware, which is very difficult and expensive to achieve. In Section 3 a method will be presented for resolving this problem.

The second problem in practical RTF equalization relates to the variation of $h(n)$ with the source and receiver positions inside the enclosure [4], [9], [42]. This function (assuming ideal omnidirectional point source and receiver) is only specified for a spatial vector $\mathbf{l} = (s_x, s_y, s_z, r_x, r_y, r_z)$, which defines three source and three receiver coordinates inside the enclosure. As was shown in [4], when $h_i(n)$ is designed from a measurement of $h(n)$ obtained at the spatial position vector $\mathbf{l}$, then when it is applied for RTF equalization at other positions, say $\mathbf{l}'$, an error will result which, for $|\mathbf{l} - \mathbf{l}'| > D_c/2$, $D_c$ being the room critical distance in meters, will be greater than the original distortion due to RTF. In practical terms this restricts the use of RTF equalization in all the applications listed in the Introduction since it is often required to compensate for or remove RTF distortions in more than a single receiver (listener) position inside the room, and ideally for all possible positions. Fig. 6 shows the equalization error due to incorrectly measured RTF (in terms of the proper spatial receiver position).

Given that RTFs cannot be practically predicted in
most cases, this problem is usually solved by taking measurements for varying source and receiver configurations. For example, multiple-input least-squares deconvolution methods have been proposed [5], [32] for active noise cancellation and audio applications. In a similar fashion, microphone arrays have been proposed in videoconferencing and computer speech recognition applications [7], [8]. Often such methods employ multiple-input-multiple-band arrangements [22], [27].

All these approaches have the disadvantage of increased complexity and cost due to the requirement for numerous microphone inputs and associated processing peripherals. It has also been suggested that for various audio applications this problem can be resolved by taking RTF (modulus) averages for different source and receiver spatial placements. However, such an approach is prone to erroneous RTF estimation, and at best results in compromised equalizer performance. Very recently [33] a promising approach based on all-pole RTF modeling has been proposed, which utilizes the common poles existing in such functions independent of the source and receiver positions. An optimum approach to resolve not only the problem of spatial RTF variation but also the variations due to other audio and acoustic channels has been proposed in [34] and is presented in more detail in the following section.

Fig. 5. Effect of equalizer filter length on performance. Results should be compared with those of Fig. 4. (a) Time-domain equalizer output for $n_s$ = 2048. (b) Frequency-domain equalizer output for $n_s$ = 2048. (c) Time-domain equalizer output for $n_s$ = 512. (d) Frequency-domain equalizer output for $n_s$ = 512.

Fig. 6. Effect of equalizer spatial mismatch on its performance. Results should be compared with perfectly matched results of Fig. 4. (a) Time-domain equalizer output for $|\mathbf{l} - \mathbf{l}'|$ = 1.5 m. (b) Frequency-domain equalizer output for $|\mathbf{l} - \mathbf{l}'|$ = 1.5 m. (c) Time-domain equalizer output for $|\mathbf{l} - \mathbf{l}'|$ = 2 m. (d) Frequency-domain equalizer output for $|\mathbf{l} - \mathbf{l}'|$ = 2 m. In all cases $n_s$ = 4096 samples.

The discussion in this section was concerned with the measurable (or objective) problems associated with RTF deconvolution. However, there is evidence from psychoacoustics suggesting that such a complex and, as was shown, only partially successful mathematical procedure is not the optimal approach for achieving the required subjective dereverberation results. One has only to think of the adaptation to reverberation effects achieved by the listener when moving within a room or even between different enclosures to see that dramatic physical variations in acoustics are easily anticipated by the hearing mechanisms. Similarly, the auditory mechanism, mainly by utilizing its binaural configuration, seems able to recover useful acoustic information even at very low direct-to-reverberant-signal ratios, that is, when the previously described single-channel deconvolution approach would require excessive inverse filter lengths. However, the utilization of such perceptual results,
which are not yet fully supported by experimental evidence or detailed models, remains outside the scope of the present paper.

3 RTF EQUALIZATION BY VECTOR QUANTIZATION

3.1 All-Pole RTF Model

An all-pole RTF modeling technique can achieve the necessary reduction in the RTF and the resulting equalizer order, as was proposed in [9]. According to this approach, the all-pole RTF approximation (ARTF) $H_p(z)$ is defined as

$$H_p(z) = \frac{G}{1 + \sum_{k=1}^{P} d_k z^{-k}} \tag{6}$$

where $G$ is an arbitrary gain constant (often set to 1), $d_k$ are the all-pole model coefficients, and $P$ is the model order. In practice such a model succeeds in representing RTF resonances well, but it is much less successful at modeling RTF dips. Typically for $P$ = 200 a good approximation to an RTF can be achieved, but ARTF resolution can be changed (by changing the value of $P$) in order to accommodate a particular value of RT or specific processing requirements. By the requirement that all zeros be on the origin, such a model will give a minimum-phase approximation to the RTF. Fig. 7 shows the ARTF model for a particular measured RTF.

Fig. 7. RTF of rectangular room (5-kHz bandwidth) and its ARTF approximation for $P$ = 200. (Horizontal axis: frequency in Hz.)

An RTF equalizer can be designed from $H_p(z)$ having the advantage of greatly reduced order. Its impulse response is

$$h_{ip}(n) = Z^{-1}\left\{ \frac{1}{H_p(z)} \right\} \tag{7}$$

where $Z^{-1}\{\ \}$ represents the inverse $z$ transform of the function. There are a number of advantages for using such an approach to RTF equalization.

1) The order of the inverse filter (equalizer) can be reduced by up to approximately 50 times (for a 10,000-point response) so that its real-time implementation can be achieved easily using current DSP microprocessor technology.

2) The ARTF equalizer succeeds in removing RTF resonances and not RTF dips. This type of correction is more acceptable from a perceptual point of view [35] than the reverse situation, although obviously still less desirable than complete RTF correction.

3) The ARTF was found [9], [33] to be less sensitive to source-receiver position changes than the original RTF. However, more work has to be undertaken in order to evaluate such effects completely.

Given that ARTF modeling achieves a required reduction in the equalizer order (at some expense of its performance), this method is used as the first stage of RTF processing so that, at a second stage, the vector quantization technique described in the following section can be used. This technique helps with the problem of equalizer sensitivity to source-receiver position changes.

3.2 Vector Quantization of RTF

Vector quantization (VQ) is a technique that achieves optimal classification of large sets of vectors into a smaller number of groups having nevertheless representative properties [36]. In the case of room acoustics it was proposed [34] that, due to different source-receiver configurations inside the enclosure, the extremely large set of possible RTFs will be grouped together and represented or equalized by a smaller number of equalizers. Such an approach differs from the empirical and error-prone method of RTF spatial averaging since the VQ approach will generate optimal RTF averages, and its accuracy will be limited only by the number of groups chosen. By using the VQ method a three-dimensional codebook of RTFs can be estimated for each enclosure, which can be used for equalizer design. The application of this method, which is already used extensively in pattern and speech recognition applications [36], [37], can also be extended to other areas of audio signal processing and room acoustics, since it can be employed as an optimal classifier of audio system responses.

The input vectors to the VQ are obtained from measured ARTFs taken for different source-receiver configurations (given by $\mathbf{l}$). Such vectors can be used as inputs to the next stage, that is, the design of the VQ codebook. However, as was found in many speech analysis applications, all-pole coefficients $d_k$ can also be transformed into cepstral coefficients $x_k$ prior to VQ.

The cepstral coefficients are defined following the expansion of the (minimum-phase) polynomial $H_p(z)$ according to

$$\ln[H_p(z)] = \sum_{k=1}^{\infty} x_k z^{-k}. \tag{8a}$$

The cepstral coefficients can also be derived recursively [38] after differentiation of Eq. (8a) with respect to $z^{-1}$, yielding

$$
\begin{aligned}
x_1 &= -d_1 \\
x_k &= -d_k - \sum_{n=1}^{k-1} \frac{n}{k}\, x_n d_{k-n}, \qquad 2 \le k \le P \\
x_k &= -\sum_{n=1}^{k-1} \frac{n}{k}\, x_n d_{k-n}, \qquad P < k
\end{aligned}
\tag{8b}
$$

where $d_k$ is taken as zero for $k > P$.
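As an illustration, the recursion of Eq. (8b) can be sketched in a few lines of Python. This is a minimal sketch and not the implementation used for the results reported here: the function name `allpole_to_cepstrum`, the use of NumPy, and the choice $G = 1$ are assumptions, and the weight $n/k$ inside the sum follows the standard all-pole-to-cepstrum recursion. The check at the end uses a two-pole model whose cepstrum is known in closed form.

```python
import numpy as np

def allpole_to_cepstrum(d, n_ceps):
    """Cepstral coefficients x_1..x_n_ceps of H_p(z) = 1 / (1 + sum_k d_k z^-k).

    d holds d_1..d_P (gain G = 1 is assumed, so there is no x_0 term).
    Hypothetical helper illustrating the recursion of Eq. (8b).
    """
    P = len(d)
    m = min(P, n_ceps)
    x = np.zeros(n_ceps + 1)      # x[0] is unused so indices match the text
    dk = np.zeros(n_ceps + 1)     # d_k padded with zeros for k > P
    dk[1:m + 1] = d[:m]
    for k in range(1, n_ceps + 1):
        # sum_{n=1}^{k-1} (n / k) x_n d_{k-n}; the sum is empty for k = 1
        acc = sum(n * x[n] * dk[k - n] for n in range(1, k))
        x[k] = -dk[k] - acc / k
    return x[1:]

# Two real poles at 0.5 and 0.4:
# 1 + d_1 z^-1 + d_2 z^-2 = (1 - 0.5 z^-1)(1 - 0.4 z^-1)
d = [-0.9, 0.2]
x = allpole_to_cepstrum(d, 6)

# For this model the cepstrum has the closed form x_k = (0.5**k + 0.4**k) / k
closed_form = np.array([(0.5 ** k + 0.4 ** k) / k for k in range(1, 7)])
print(np.allclose(x, closed_form))
```

Because the coefficients decay roughly as $1/k$ times the $k$th power of the largest pole radius, the vector can in principle be truncated at a length other than $P$; the text above uses $P$-dimensional vectors.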
The use of a vector of cepstral instead of all-pole model coefficients was found in this case, as in many other speech processing applications, to facilitate the subsequent VQ classification, mainly because of the improved numerical properties of these coefficients. However, such a transformation is by no means a prerequisite to the use of VQ methods, and other, perhaps perceptually based, transformations may also be proposed and expand the scope of the VQ equalization method. Hence after the evaluation of cepstral coefficients, $P$-dimensional vectors $\mathbf{x} = (x_1, \ldots, x_P)^T$ are estimated, one at each spatial vector $\mathbf{l}$, and are used as inputs of the VQ. The quantizer will produce optimal mapping of vectors $\mathbf{x}$ into output vectors $\hat{\mathbf{x}}$. The latter will belong to an $N$-level codebook which includes possible patterns $\mathbf{y}_i$, that is, the codebook will be

$$A = \{\mathbf{y}_i;\ i = 1, \ldots, N\}. \tag{9}$$

The optimality of the VQ approach lies in the placement of the code vectors within the $P$-dimensional space by the process of the continuous minimization of the distortion measure. In effect the quantizer is asked to separate the input vector space into groups (or centers, often termed centroids) $S_i$ by use of such a codebook according to

$$S = \{S_i;\ i = 1, \ldots, N\}. \tag{10}$$

This will achieve the mapping of each input vector into the $i$th codeword,

$$S_i = \{\mathbf{x} : q(\mathbf{x}) = \mathbf{y}_i\} \tag{11}$$

where $q(\mathbf{x})$ is the vector quantization operator.

For such an optimal quantizer the expected mapping distortion $Q_i$ must be minimized in a global sense. This distortion measure is equal to

$$Q_i = \sum_{\mathbf{x} \in S_i(v)} d(\mathbf{x}, \mathbf{y}_i(v)) \tag{12}$$

where $d(\ )$ is a distance measure to be defined shortly, $i = 1, \ldots, N$ is the specified number of centers, and $v$ is the iteration number during the process of distortion minimization. In practice such minimization can be achieved by using different algorithms [37], from which the "k-means with splitting" algorithm (also known as LBG) was found to be the most appropriate for the particular application. This algorithm proceeds by splitting the input vector space into groups by evaluating and minimizing the weighted Euclidean distance criterion $d(\mathbf{x}, \hat{\mathbf{x}})$, defined as

$$d(\mathbf{x}, \hat{\mathbf{x}}) = \sqrt{(\mathbf{x} - \hat{\mathbf{x}})^T W^{-1} (\mathbf{x} - \hat{\mathbf{x}})}. \tag{13}$$

Here $W$ is the positive definite diagonal covariance matrix of sample vectors, estimated according to

$$W = \frac{1}{M} \sum_{i=1}^{M} (\mathbf{x}_i - \mathbf{m})(\mathbf{x}_i - \mathbf{m})^T \tag{14}$$

where $\mathbf{m}$ is the mean value of vector $\mathbf{x}$, $M$ is the number of training vectors (to be explained), and $(\ )^T$ indicates vector transpose. From the foregoing expressions it becomes clear that the VQ method uses RTF averaging during its optimization stage, often referred to as the "training" mode, that is, when the algorithm iteratively splits the centers up to the specified maximum value $N$. At this stage the performance of the VQ classifier can also be assessed, as will be better explained in the following section. Given that appropriate training data have been given to the algorithm, then during its operational mode (usually referred to as test mode) the quantizer will automatically assign any general input vector into the appropriate code word $\mathbf{y}_i$ by using the nearest-neighbor (NN) criterion [based on the distance criterion of Eq. (13)]. The use of the NN search routine will impose a computational load consisting of $2 \times N \times P$ additions/subtractions, $2 \times N \times P$ multiplications, and $N$ comparison operations. Fig. 8 gives the block diagram of the proposed VQ classifier.

For equalization the inverse filter corresponding to ARTFs (and not cepstral coefficients) was employed, derived in the way described in [9] and letting $\mathbf{z}_i$ be the inverse filter corresponding to $\mathbf{y}_i$. Then, during equalizer operation it will be possible that the coefficients for $\mathbf{z}_i$ will only be downloaded into the equalizer when it is detected that the listener is moving into a location whose RTF has been assigned to code word $\mathbf{y}_i$. This option was not realized here, and as will be discussed in the next section, the tests were conducted for random receiver positions. Fig. 9 gives a block diagram for the operating mode of the VQ equalizer.

Clearly this approach will require previous measurements of the RTFs in order to design the enclosure's codebook. Nevertheless, it may be possible to design RTF codebooks for typical room listening environments, using large sets of off-line measurements. Then these codebooks may be fine-tuned and adapted to the specific listening environment used for the equalization application. However, this approach was not tested during the work reported here and remains a topic for further investigation.

4 PERFORMANCE OF THE VQ EQUALIZATION METHOD

Due to the extremely large number of RTF measurements required for the evaluation of the performance of this method, the results discussed here refer to simulations obtained by using the Allen and Berkley [39] method based on image techniques. This algorithm assumes ideal omnidirectional point source and receiver, and simplified reflective properties of the wall boundaries for rectangular enclosures. Such simulations were carried out for enclosures of different sizes, and for the source placed either at the room's center or at the corner, at some distance from the walls, as is usual in practical situations. The position of the receiver was altered, moving it in all three dimensions and at distances $|\mathbf{l} - \mathbf{l}'|$ incremented by a constant amount. These distances were
chosen to be 0.1, 0.2, 0.3, and 0.4 m. Typically $3 \times 10^3$ different measurements were taken for a medium-size room simulation. For each measurement the source-to-receiver impulse response function $h(n)$ was estimated (that is, the time-domain representation of RTF), from which the ARTF coefficients were evaluated for $P = 200$, as was described in Section 4. From each such set of ARTF coefficients an input vector of cepstral vectors was generated and was used as input for the VQ during its "training" mode.
It was found that the VQ performance was satisfactory in the sense that the input vectors were always clustered in an optimum way so that the mapping error distance between vectors classified to a given center (intradistance) and to all other vectors (interdistance) had a nearly Gaussian distribution with minimum overlap, as shown in Fig. 10. In the same figure such ARTF classification performance tests are also shown for central and corner source positions. As can be observed, better VQ performance was achieved for central receiver positions than for corner positions, where for distances smaller than 0.3 m from the walls, centers with singular input classification (empty cells) were often generated.
A visualization of such an RTF codebook generated for an ideal rectangular enclosure and for central source position is shown in Figs. 11 and 12. As can be observed, for any possible receiver position a particular code word (RTF) is specified by the VQ. Areas belonging to the same center are shown with the same gray-scale tone. The darker this tone, the higher is the number of input vectors classified to the particular center.
During the test mode of VQ operation, random receiver positions were chosen in order to assess the ability of the VQ to classify such an input vector $x$ to the most appropriate center. In particular it was measured how well an equalizer would perform when it is designed by such a classification. Ideally such a VQ equalizer should always remove RTF distortions from inputs by using only inverse filters designed from the corresponding centers. The results of such equalization were assessed in the frequency domain by taking

$$|E(\omega)| = |X'(\omega)|\,|Z(\omega)| \tag{15}$$

where $E(\omega)$ is the spectrum of the deconvolution residual (that is, the error after equalization), $X'(\omega)$ is the spectrum of the random input ARTF, and $Z(\omega)$ is the inverse spectrum of the corresponding VQ center. Although these results are applicable to ARTFs and not RTFs, as was found in [9], they are also indicative of equalization gains obtained for the full RTF, that is, when convolving $Z(\omega)$ with $H(\omega)$. For perfect equalization, $|E(\omega)|$ should be completely "flat". The degree of deviation from this ideal condition can be assessed by the variance $\sigma_E$ of this function, defined as

$$\sigma_E = \sqrt{\frac{1}{P}\sum_{p=0}^{P-1}\bigl(20\log|E(\omega_p)| - AV\bigr)^2}\ \ \text{dB} \tag{16}$$

where $AV$ is the mean value of this spectrum,

$$AV = \frac{1}{P}\sum_{p=0}^{P-1} 20\log|E(\omega_p)|\ \ \text{dB}. \tag{17}$$

[Figure: block diagram with blocks labeled receiver position, RTF codebook, RTF codebook search, and digital equalizer; output labeled equalized audio out.]
Fig. 9. Block diagram for application of VQ equalizer.
[Figure: block diagram. Room response (RTF); all-pole model (ARTF); cepstral coefficient estimation. Training mode: k-means VQ algorithm, RTF codebook, VQ performance evaluation. Test mode: nearest neighboring classification, equalizer design, equalizer performance evaluation.]
Fig. 8. Block diagram of VQ algorithm.
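The training and test modes of Fig. 8 can be illustrated with a small k-means and nearest-neighbor sketch. Several things here are assumptions made for illustration rather than details taken from the paper: plain squared Euclidean distance is used in place of the paper's distance measure, the centers are initialized from the first vectors of the training set, and the two-dimensional vectors stand in for cepstral vectors. All function names are hypothetical.

```python
def sq_dist(u, v):
    """Squared Euclidean distance (assumed here in place of the paper's measure)."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def nearest_center(x, centers):
    """Test mode: index of the codebook center closest to input vector x."""
    return min(range(len(centers)), key=lambda j: sq_dist(x, centers[j]))

def train_codebook(vectors, n_centers, n_iter=20):
    """Training mode: plain k-means, initialized from the first n_centers vectors."""
    centers = [list(v) for v in vectors[:n_centers]]
    for _ in range(n_iter):
        cells = [[] for _ in centers]
        for x in vectors:
            cells[nearest_center(x, centers)].append(x)
        for j, cell in enumerate(cells):
            if cell:  # an empty cell keeps its previous center
                centers[j] = [sum(col) / len(cell) for col in zip(*cell)]
    return centers

# Hypothetical two-dimensional stand-ins for cepstral vectors.
training = [[0.0, 0.0], [10.0, 10.0], [0.2, 0.1],
            [9.8, 10.1], [0.1, -0.1], [10.2, 9.9]]
codebook = train_codebook(training, n_centers=2)
label = nearest_center([9.0, 9.5], codebook)
```

In an equalizer of the kind described, the selected center would then supply the inverse filter for the new receiver position.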
The value of measured variance was compared to that of the original random input $\sigma^2$, and these results are plotted in Fig. 13 as functions of the number of VQ centers. These results indicate that for a typical enclosure (having dimensions of 8 by 4 by 3 m) the ARTF variance decreased after equalization by up to 6.5 dB for the central source position and by up to 4.5 dB for the corner source position. Similar results were also observed for other enclosures, as shown in Table 1, and, significantly, an improvement was observed for all random inputs (that is, random receiver positions). Typical VQ RTF equalization results are shown in Fig. 14, indicating that a "flattening" of the overall spectrum can be achieved, with a reduction in the average resonancy width, which may also be perceptually beneficial [35].
Although these results are encouraging, indicating that a global equalization codebook can be designed for an enclosure, more work must be undertaken in order to develop the proposed technique further. An area of possible improvement is the restriction of the source and receiver positions used during the codebook design stage. For example, it may be advantageous to employ RTF measurements (for subsequent codebook classification) across a restricted area in the room, corresponding to typical listening positions, that is, at a certain height from the floor and at some distance from the sidewalls. This may reduce the initial variance in the data and, provided that during the test mode the receiver will be located in a similar area, more equalization improvement may be achieved by the proposed technique. However, such a study remains open for a future investigation.

[Figure: histograms of intradistance and interdistance versus distance, panels (a) and (b).]
Fig. 10. VQ classifier performance for rectangular-room ARTFs and all possible receiver positions. (a) Central source position. (b) Corner source position. Room dimensions 8 by 4 by 3 m; P = 200; N = 25.
Fig. 11. Three-dimensional visualization results for spatial RTF VQ, corresponding to ideal cubic room (dimension 3 m) with central source position and representation of source-to-receiver response measurements resulting in moving from position 1 to position 2. For tests, receiver was moved by increments of 0.1 m in all three directions.
Fig. 12. VQ results for room of Fig. 10. Areas with similar gray-scale tone indicate input classification to a particular VQ center. The darker the tone, the higher the number of classified RTFs. N = 256; P = 200.

5 CONCLUSIONS
Although the methods achieving complete RTF equalization were known for at least a decade, their robust realization into real-time systems for numerous audio and acoustic applications was not possible due to practical problems. Here it is shown that when minimum-phase equalization is required, the combination of all-pole RTF modeling and VQ methods can resolve the main problems of such equalization systems by reducing their high order and at the same time making them effective for all possible source and receiver positions inside the enclosure. In more detail it was found that a VQ codebook can be designed which will always achieve equalization gains (that is, spectral flattening), irrespective of the receiver position inside the room.
Furthermore it was found that the VQ equalization technique performed better for central than corner source positions inside the enclosure, as is also expected from
acoustic theory. This may be due to the fact that corner and near-the-wall source positions excite a larger number of modes, resulting in relatively atypical RTFs when compared to those obtained for other source positions inside the room [40].
Typically the use of the RTF approximation can reduce the equalizer order by up to 50 times, whereas the use of VQ clustering can also achieve a similar reduction over the number of all required spatial RTF measurements. Therefore a large reduction in the complexity of a practical RTF equalizer can be achieved with a combination of these two techniques, provided that a careful initial design of the VQ codebook has been derived and the receiver position inside the room can be specified.

6 ACKNOWLEDGMENT
The author acknowledges the contributions of A. Tsopanoglou, J. Georgiou, A. Kalambouki, M. Paraskevas, D. Tsoukalas, and G. Karachalios to this work. He also wants to thank ICON Computer Graphics and Animation, Thessaloniki, Greece, for the three-dimensional visualization of the VQ results.

Table 1. Properties of tested rectangular enclosures and ARTF variance improvement after VQ equalization.*

Room Dimensions (m)   RT (s)   ARTF Variance Improvement (dB)
                               Center Source Position   Corner Source Position
4 x 3.5 x 3           0.37     16                       6
8 x 4 x 3             0.45     6.5                      4.5
12 x 6 x 3            0.55     5.5                      4

*In all cases, P = 200 and N = 256.

[Figure: input ARTF and output ARTF variance versus number of centroids N (4, 16, 64, 256), panels (a) and (b).]
Fig. 13. Results for VQ equalizer, expressed as frequency-domain modulus response variance for input and output (equalized) RTFs as functions of specified number of VQ centers. (a) Central source position. (b) Corner source position. Room dimensions 8 by 4 by 3 m.
[Figure: magnitude versus frequency (Hz, 1000 to 5000), panels (a) and (b).]
Fig. 14. Typical RTF equalization results achieved by VQ method. Results apply to 4- by 3.5- by 3-m room for which a VQ codebook was initially designed. Then a response corresponding to a randomly chosen receiver position was equalized by reference to the VQ codebook. (a) Original RTF. (b) VQ equalizer output. P = 200; N = 256.

7 REFERENCES
[1] R. D. Genereux, "Adaptive Loudspeaker Systems: Correcting for the Acoustic Environment," in Proc. 8th AES Int. Conf. (1992), pp. 245-256.
[2] S. T. Neely and J. B. Allen, "Invertibility of a Room Impulse Response," J. Acoust. Soc. Am., vol. 66, pp. 165-169 (1979).
[3] R. Berkovitz, "Digital Equalization of Audio Signals," in Collected Papers on Digital Audio (Audio Engineering Society, New York, 1983), pp. 226-238.
[4] J. Mourjopoulos, "On the Variation and Invertibility of Room Impulse Response Functions," J. Sound Vib., vol. 102, pp. 217-228 (1985).
[5] S. J. Elliot and P. A. Nelson, "Multiple-Point Equalization in a Room Using Adaptive Digital Filters," J. Audio Eng. Soc., vol. 37, pp. 899-907 (1989 Nov.).
[6] P. J. Bloom and G. P. Cain, "Speech Dereverberation: Performance of Signal Processing Algorithms and Their Effect on Intelligibility," in Proc. Inst. of Acoustics Spring Conf. (1981), pp. 22-24.
[7] J. L. Flanagan, J. D. Johnston, R. Zahn, and G. W. Elko, "Computer-Steered Microphone Arrays for Sound Transduction in Large Rooms," J. Acoust. Soc. Am., vol. 78, pp. 1508-1518 (1985).
[8] W. Kellermann, "A Self-Steering Digital Microphone Array," in Proc. IEEE Int. Conf. on ASSP (1991), pp. 3581-3584.
[9] J. Mourjopoulos and M. Paraskevas, "Pole-Zero Modelling of Room Transfer Functions," J. Sound Vib., vol. 146, pp. 281-302 (1991).
[10] M. R. Schroeder and K. H. Kuttruff, "On Frequency Response Curves in Rooms," J. Acoust. Soc. Am., vol. 34, pp. 76-80 (1962).
[11] R. J. Wyber, "The Application of Digital Processing to Acoustic Testing," IEEE Trans. ASSP, vol. 22, pp. 66-72 (1974).
[12] J. K. Hammond and J. Mourjopoulos, "Cepstral Methods Applied to the Analysis of Room Impulse Response Functions," in Proc. Inst. of Acoustics Conf. (1980 Nov.), pp. 51-54.
[13] A. V. Oppenheim and R. W. Schafer, Digital Signal Processing (Prentice-Hall, Englewood Cliffs, NJ, 1975).
[14] P. D. Bauman, S. P. Lipshitz, T. C. Scott, and J. Vanderkooy, "Cepstral Techniques for Transducer Measurement," presented at the 76th Convention of the Audio Engineering Society, J. Audio Eng. Soc. (Abstracts), vol. 32, p. 1002 (1984 Dec.), preprint 2172.
[15] D. Preis, "Phase Distortion and Phase Equalization in Audio Signal Processing--A Tutorial Review," J. Audio Eng. Soc., vol. 30, pp. 774-794 (1982 Nov.).
[16] J. M. Kates, "Digital Analysis of Loudspeaker Performance," in Proc. IEEE Int. Conf. on ASSP (1977), pp. 377-380.
[17] P. M. Clarkson, J. Mourjopoulos, and J. K. Hammond, "Spectral, Phase, and Transient Equalization for Audio Systems," J. Audio Eng. Soc., vol. 33, pp. 127-132 (1985 Mar.).
[18] J. Mourjopoulos, P. M. Clarkson, and J. K. Hammond, "A Comparative Study of Least-Squares and Homomorphic Techniques for the Inversion of Mixed-Phase Signals," in Proc. IEEE Int. Conf. on ASSP (1982), pp. 1858-1861.
[19] A. V. Oppenheim, G. E. Kopec, and J. M. Tribolet, "Signal Analysis by Homomorphic Prediction," IEEE Trans. ASSP, vol. 24, pp. 327-332 (1976).
[20] P. M. Clarkson, Optimal and Adaptive Signal Processing (CRC Press, Boca Raton, FL, 1993).
[21] J. Mourjopoulos, "The Removal of Reverberation from Signals," Ph.D. dissertation, University of Southampton, UK (1984).
[22] H. Yamada, H. Wang, and F. Itacura, "Recovering of Broad Band Reverberant Speech Signal by Sub-Band Mind Method," in Proc. IEEE Int. Conf. on ASSP (1991), pp. 969-972.
[23] D. Preis, "Audio Signal Processing with Transversal Filters," in Proc. IEEE Int. Conf. on ASSP (1979), pp. 310-313.
[24] C. Bunks and D. Preis, "Minimax Time-Domain Deconvolution for Transversed Filter Equalisers," in Proc. IEEE Int. Conf. on ASSP (1980), pp. 943-946.
[25] R. Berkovitz and G. Abbot, "A Loudspeaker Measurement System Based on Signal Processing and Biophysical Simulation," J. Audio Eng. Soc. (Abstracts), vol. 28, p. 922 (1980 Dec.), preprint 1711.
[26] P. Fryer and R. Lee, "Use of Tapped Delay Lines in Loudspeaker Work," presented at the 65th Convention of the Audio Engineering Society, J. Audio Eng. Soc. (Abstracts), vol. 28, p. 374 (1980 May), preprint 1588.
[27] H. Wang and F. Itakura, "An Approach of Dereverberation Using Multimicrophone Sub-band Envelope Estimation," in Proc. IEEE Int. Conf. on ASSP (1991), pp. 953-956.
[28] J. L. Flanagan and R. C. Lummis, "Signal Processing to Reduce Multipath Distortion in Small Rooms," J. Acoust. Soc. Am., vol. 47, pp. 1475-1481 (1970).
[29] D. Bees, M. Blostein, and P. Kabal, "Reverberant Speech Enhancement Using Cepstral Processing," in Proc. IEEE Int. Conf. on ASSP (1991), pp. 977-980.
[30] T. Langhans and H. Strube, "Speech Enhancement by Non-Linear Multiband Envelope Filtering," in Proc. IEEE Int. Conf. on ASSP (1982), pp. 156-159.
[31] J. Mourjopoulos, "Digital Equalization Methods for Audio Systems," presented at the 84th Convention of the Audio Engineering Society, J. Audio Eng. Soc. (Abstracts), vol. 36, p. 384 (1988 May), preprint 2598.
[32] R. Wilson, "Equalization of Loudspeaker Drive Units Considering Both On- and Off-Axis Responses," J. Audio Eng. Soc., vol. 39, pp. 127-139 (1991 Mar.).
[33] Y. Haneda, S. Makino, and Y. Kaneda, "Common Acoustical Pole and Zero Modeling of Room Transfer Functions," IEEE Trans. Speech Audio Process., vol. 2, pp. 320-328 (1994).
[34] J. Mourjopoulos, A. Tsopanoglou, and N. Fakotakis, "A Vector Quantization Approach for Room Transfer Function Classification," in Proc. IEEE Int. Conf. on ASSP (1991), pp. 3593-3596.
[35] F. E. Toole and S. E. Olive, "The Modification of Timbre by Resonances: Perception and Measurement," J. Audio Eng. Soc., vol. 36, pp. 122-141 (1988 Mar.).
[36] R. M. Gray, "Vector Quantization," IEEE ASSP Mag., pp. 4-29 (1984 Apr.).
[37] Y. Linde, A. Buso, and R. M. Gray, "An Algorithm for Vector Quantizer Design," IEEE Trans. Commun., vol. 28, pp. 84-85 (1980).
[38] B. S. Atal, "Linear Prediction for Speaker Identification," J. Acoust. Soc. Am., vol. 55, pp. 1304-1312 (1974).
[39] J. B. Allen and D. A. Berkley, "Image Method for Efficiently Simulating Small-Room Acoustics," J. Acoust. Soc. Am., vol. 66, pp. 943-950 (1979).
[40] H. Kuttruff, Room Acoustics (Applied Science Publ., London, 1979).
[41] J. B. Allen, D. A. Berkley, and J. Blauert, "Multimicrophone Signal Processing Technique to Remove Reverberation from Speech Signals," J. Acoust. Soc. Am., vol. 62, pp. 912-915 (1977).
[42] P. E. Doak, "Fluctuation of the Sound Pressure Level when the Receiver Position Is Varied," Acustica, vol. 9, no. 1, pp. 1-9 (1959).

APPENDIX
INVERSION OF MIXED-PHASE FUNCTIONS

A.1 General Description of Functions
Let us consider the discrete-time response of a linear, stable, and causal system described by $h(n)$. The system's transfer function will be represented by the z-transform

$$H(z) = \frac{\prod_{i=1}^{N}(1 - a_i z^{-1})}{\prod_{j=1}^{M}(1 - b_j z^{-1})}. \tag{18}$$

It is clear that for stability and causality, $|b_j| < 1$, $j = 1, 2, \ldots, M$. However, the zeros of $H(z)$ can be either inside, outside, or both with regard to $|z| = 1$, that is,

$$|a_i| < 1,\quad i = 1, 2, \ldots, k$$
$$|a_i| > 1,\quad i = k+1, k+2, \ldots, N.$$

Then such a general mixed-phase function will be expressed as

$$H_{\mathrm{mix}}(z) = \frac{\prod_{i=1}^{k}(1 - a_i z^{-1})\,\prod_{i=k+1}^{N}(1 - a_i z^{-1})}{\prod_{j=1}^{M}(1 - b_j z^{-1})}. \tag{19}$$

In the case of the current study we will be concerned with finite-length responses, and in this case all poles of $H_{\mathrm{mix}}(z)$ will be at the origin (assuming also that there are no zeros at the unit circle, $|z| = 1$),

$$H_{\mathrm{mix}}(z) = \prod_{i=1}^{k}(1 - a_i z^{-1})\,\prod_{i=k+1}^{N}(1 - a_i z^{-1}) = H_{\mathrm{min}}(z)\,H_{\mathrm{max}}(z). \tag{20}$$

According to the definition in [13], the minimum-phase component $H_{\mathrm{min}}(z)$ of $H_{\mathrm{mix}}(z)$ will be

$$H_{\mathrm{min}}(z) = \prod_{i=1}^{k}(1 - a_i z^{-1}) \tag{21}$$

and the maximum-phase component $H_{\mathrm{max}}(z)$ will be

$$H_{\mathrm{max}}(z) = \prod_{i=k+1}^{N}(1 - a_i z^{-1}). \tag{22}$$

A second decomposition of $H_{\mathrm{mix}}(z)$ may also be obtained if we write

$$H_{\mathrm{mix}}(z) = \prod_{i=1}^{k}(1 - a_i z^{-1})\,\prod_{i=k+1}^{N}(z^{-1} - a_i)\;\frac{\prod_{i=k+1}^{N}(1 - a_i z^{-1})}{\prod_{i=k+1}^{N}(z^{-1} - a_i)}. \tag{23}$$

Then the equivalent minimum-phase component $H_{\mathrm{eq}}(z)$ of $H_{\mathrm{mix}}(z)$ is

$$H_{\mathrm{eq}}(z) = \prod_{i=1}^{k}(1 - a_i z^{-1})\,\prod_{i=k+1}^{N}(z^{-1} - a_i) \tag{24}$$

and the all-pass component $H_{\mathrm{ap}}(z)$ of $H_{\mathrm{mix}}(z)$ is

$$H_{\mathrm{ap}}(z) = \frac{\prod_{i=k+1}^{N}(1 - a_i z^{-1})}{\prod_{i=k+1}^{N}(z^{-1} - a_i)}. \tag{25}$$

If the first decomposition is adopted, then for the implementation of any inversion with the view of achieving $H_{\mathrm{mix}}(z)\,H_{i\,\mathrm{mix}}(z) = 1$ the following must be noted.
1) The minimum-phase part can be directly inverted by taking

$$H_{i\,\mathrm{min}}(z) = \frac{1}{H_{\mathrm{min}}(z)} = \frac{1}{\prod_{i=1}^{k}(1 - a_i z^{-1})} \tag{26}$$

since this function will be stable (having its poles inside $|z| = 1$).
2) Similar direct inversion of $H_{\mathrm{max}}(z)$ will result in an unstable function,

$$H_{i\,\mathrm{max}}(z) = \frac{1}{H_{\mathrm{max}}(z)} = \frac{1}{\prod_{i=k+1}^{N}(1 - a_i z^{-1})} \tag{27}$$

since this function will have its poles outside $|z| = 1$.
3) Therefore direct and exact inversion of a nonminimum-phase function is not possible, and only the corresponding minimum-phase part (or the equivalent minimum-phase part if the second decomposition is adopted) can be directly inverted, leaving as residue the maximum-phase part (or the all-pass part in the second case), that is,

$$H_{\mathrm{mix}}(z)\,\frac{1}{H_{\mathrm{min}}(z)} = H_{\mathrm{min}}(z)\,H_{\mathrm{max}}(z)\,\frac{1}{H_{\mathrm{min}}(z)} = H_{\mathrm{max}}(z)$$
$$H_{\mathrm{mix}}(z)\,\frac{1}{H_{\mathrm{eq}}(z)} = H_{\mathrm{eq}}(z)\,H_{\mathrm{ap}}(z)\,\frac{1}{H_{\mathrm{eq}}(z)} = H_{\mathrm{ap}}(z). \tag{28}$$

A.2 Example of Approximate Inversion of Mixed-Phase Functions
Let us now look at a specific example, which illustrates the approximate inversion of nonminimum-phase functions. Consider the simple function $H(z) = 1 + a z^{-1}$, which may represent the case of a very simplified RTF, consisting of the direct signal and a single reflection. The corresponding finite-length discrete-time response $h(n)$ will be

$$h(n) = [1, a] \quad\text{so that}\quad h(0) = 1,\ h(1) = a. \tag{29}$$

For $|a| < 1$, $H(z)$ will be a minimum-phase function, whereas for $|a| > 1$, $H(z)$ will be a maximum-phase function, that is,

$$H_{\mathrm{min}}(z) = 1 + a_1 z^{-1},\quad |a_1| < 1$$
$$H_{\mathrm{max}}(z) = 1 + a_2 z^{-1},\quad |a_2| > 1 \tag{30}$$

and

$$H_{\mathrm{mix}}(z) = H_{\mathrm{min}}(z)\,H_{\mathrm{max}}(z) = (1 + a_1 z^{-1})(1 + a_2 z^{-1}) \tag{31}$$

$H_{\mathrm{mix}}(z)$ being a general mixed-phase function which is also nonminimum phase.

A.2.1 Inversion of Minimum-Phase Function
According to the preceding discussion, direct and exact inversion of the minimum-phase part is possible,

$$H_{i\,\mathrm{min}}(z) = \frac{1}{H_{\mathrm{min}}(z)} = \frac{1}{1 + a_1 z^{-1}} = 1 - a_1 z^{-1} + a_1^2 z^{-2} - \cdots \tag{32}$$

Since $|a_1| < 1$, this function will be stable, causal, and convergent. The corresponding time-domain sequence $h_{i\,\mathrm{min}}(n)$ will have the same properties. Strictly speaking, exact direct inversion would require an infinite-length sequence $h_{i\,\mathrm{min}}(n)$, but due to the convergence of this function, a valid approximation can be obtained by taking a finite-length part from it. For example, for $a_1 = 0.5$, $h_{\mathrm{min}}(n) = [1, 0.5]$, and $H(z) = 1 + 0.5z^{-1}$ (Fig. 15), according to Eq. (32),

$$H_{i\,\mathrm{min}}(z) = 1 - 0.5z^{-1} + 0.25z^{-2} - 0.125z^{-3} + 0.0625z^{-4} - 0.0313z^{-5} + 0.0156z^{-6} - 0.0078z^{-7} + \cdots$$
$$h_{i\,\mathrm{min}}(n) = [1, -0.5, 0.25, -0.125, 0.0625, -0.0313, 0.0156, \ldots].$$

The corresponding spectra are also shown in Fig. 15. The result of discrete-time deconvolution of the original sequence with its inverse will be

$$h_{\mathrm{out\,min}}(n) = h_{\mathrm{min}}(n) * h_{i\,\mathrm{min}}(n) = [1.000, 0.000, 0.000, \ldots]$$

which has a flat spectrum and linear phase, as shown in Fig. 15.

A.2.2 Inversion of Maximum-Phase Function
According to the preceding discussion, direct and exact inversion of the maximum-phase part is not possible, as

$$H_{i\,\mathrm{max}}(z) = 1 - a_2 z^{-1} + a_2^2 z^{-2} - \cdots \tag{33}$$

will be an unstable function since the amplitude of the terms will increase without bound. However [20], let us now consider the expansion of $H_{i\,\mathrm{max}}(z)$ in powers of $z$ rather than $z^{-1}$,

$$\tilde{H}_{i\,\mathrm{max}}(z) = \frac{1}{a_2}\,z - \frac{1}{a_2^{2}}\,z^{2} + \cdots \tag{34}$$

Obviously, the function $\tilde{H}_{i\,\mathrm{max}}(z)$ will be stable, convergent, and acausal since $z^{-1}$ represents a delay term and $z$ represents advancement. Therefore $\tilde{H}_{i\,\mathrm{max}}(z)$ is anticipatory or, more correctly, acausal. Given that real-life systems have to be causal, this function can be approximated by a delayed causal function.
Let us now introduce an $l$-term delay on $\tilde{H}_{i\,\mathrm{max}}(z)$, resulting in

$$H_{di\,\mathrm{max}}(z) = z^{-l}\,\tilde{H}_{i\,\mathrm{max}}(z) = \left(\frac{1}{a_2}\right) z^{-(l-1)} - \left(\frac{1}{a_2}\right)^{2} z^{-(l-2)} + \cdots + (-1)^{l-1}\left(\frac{1}{a_2}\right)^{l} + (-1)^{l}\left(\frac{1}{a_2}\right)^{l+1} z + \cdots \tag{35}$$

After observation of Eq. (35) it is obvious that this function is now two-sided, having $l$ causal terms, the rest being acausal terms. Let us represent the causal part by $H_{ci\,\mathrm{max}}(z)$ and the acausal part by $H_{ai\,\mathrm{max}}(z)$, that is,

$$H_{di\,\mathrm{max}}(z) = H_{ci\,\mathrm{max}}(z) + H_{ai\,\mathrm{max}}(z)$$
where

$$H_{ai\,\mathrm{max}}(z) = (-1)^{l}\left(\frac{1}{a_2}\right)^{l+1} z + \cdots \tag{36}$$

From $H_{di\,\mathrm{max}}(z)$ a causal and stable function can be derived by truncating the acausal part, that is, by setting $H_{ai\,\mathrm{max}}(z) = 0$ and defining as the approximate inverse to the maximum-phase part to be the causal part $H_{ci\,\mathrm{max}}(z)$, that is,

$$\tilde{H}_{i\,\mathrm{max}}(z) = H_{ci\,\mathrm{max}}(z) = \left(\frac{1}{a_2}\right) z^{-(l-1)} - \left(\frac{1}{a_2}\right)^{2} z^{-(l-2)} + \cdots + (-1)^{l-1}\left(\frac{1}{a_2}\right)^{l}. \tag{37}$$

The corresponding time-domain sequence $\tilde{h}_{i\,\mathrm{max}}(n)$ will be

$$\tilde{h}_{i\,\mathrm{max}}(n) = \left[\,\ldots,\ \frac{1}{a_2^{3}},\ -\frac{1}{a_2^{2}},\ \frac{1}{a_2}\,\right]. \tag{38}$$

[Figure: three panels, each showing a time sequence against n and its spectrum 20 log|H(e^{jω})| in dB. (a) h_min(n) = [1.0, 0.5]. (b) h_i min(n) = [1, -0.5, 0.25, -0.125, 0.0625, ...]. (c) h_out min(n) = h_min(n) * h_i min(n) = [1.0000, 0.0000, 0.0000, 0.0000, ...].]
Fig. 15. Example of direct inversion of minimum-phase function. Note that for clarity, all functions are shown to start at n = 1 instead of n = 0.
This is a causal and stable finite-length sequence. It is the approximate inverse to the maximum-phase function $h_{\mathrm{max}}(n)$. For example, for $a_2 = 1.8$, $h_{\mathrm{max}}(n) = [1, 1.8]$, and $H_{\mathrm{max}}(z) = 1 + 1.8z^{-1}$ [Fig. 16(a)], according to Eqs. (35)-(38), for delay $l = 5$ (as an example),

$$H_{di\,\mathrm{max}}(z) = 0.5556z^{-4} - 0.3086z^{-3} + 0.1715z^{-2} - 0.0953z^{-1} + 0.0529 - 0.0294z + 0.0163z^{2} - \cdots$$

Setting, after truncation, the causal part to be the approximate inverse,

$$\tilde{H}_{i\,\mathrm{max}}(z) = 0.556z^{-4} - \cdots - 0.0294,$$
$$\tilde{h}_{i\,\mathrm{max}}(n) = [-0.0294, 0.0529, -0.0953, 0.1715, -0.3086, 0.556]$$

[Fig. 16(b)]. The result of the convolution of $h_{\mathrm{max}}(n)$ with $\tilde{h}_{i\,\mathrm{max}}(n)$ will be

$$h_{\mathrm{out\,max}}(n) = h_{\mathrm{max}}(n) * \tilde{h}_{i\,\mathrm{max}}(n) = [-0.0294, 0.0000, -0.0001, 0.0000, 0.0001, 0.0005, 1.0008].$$

[Figure: three panels, each showing a time sequence against n and its spectrum 20 log|H(e^{jω})| in dB. (a) h_max(n) = [1, 1.8]. (b) Approximate inverse sequence. (c) h_out max(n).]
Fig. 16. Example of approximate direct inversion of maximum-phase function.
As is also shown in Fig. 16(c), this is an approximate delayed unit pulse function, which is sufficiently close to the exact result in the sense that it has a nearly flat spectrum and linear phase (except for the constant-delay term).

A.2.3 Inversion of Mixed-Phase Functions
According to the preceding discussion, any mixed-phase function $H_{\mathrm{mix}}(z)$ can be considered as the product of a minimum-phase and a maximum-phase part. Such a function will then be a nonminimum-phase one. Similarly, the approximately finite-length inverse of such a function will be a product of the minimum-phase inverse and the approximate maximum-phase inverse,

$$\tilde{H}_{i\,\mathrm{mix}}(z) = H_{i\,\mathrm{min}}(z)\,\tilde{H}_{i\,\mathrm{max}}(z)$$
$$\tilde{h}_{i\,\mathrm{mix}}(n) = h_{i\,\mathrm{min}}(n) * \tilde{h}_{i\,\mathrm{max}}(n).$$

This sequence will be causal, stable, and of finite length. For example, as before, for $a_1 = 0.5$ and $a_2 = 1.8$,

$$h_{\mathrm{mix}}(n) = h_{\mathrm{min}}(n) * h_{\mathrm{max}}(n) = [1, 0.5] * [1, 1.8] = [1.000, 2.300, 0.900]$$

as shown in Fig. 17(a), and

$$H_{\mathrm{mix}}(z) = 1.0 + 2.3z^{-1} + 0.9z^{-2}.$$

[Figure: three panels, each showing a time sequence against n and its spectrum 20 log|H(e^{jω})| in dB. (a) h_mix(n) = h_min(n) * h_max(n). (b) Approximate inverse h_i mix(n) = h_i min(n) * h_i max(n). (c) h_out mix(n) = h_mix(n) * h_i mix(n).]
Fig. 17. Example of approximate inversion of mixed-phase function.
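The construction of the mixed-phase example can be checked with a few lines of code. The sketch below is illustrative only, with hypothetical names: it convolves the two-tap sequences for $a_1 = 0.5$ and $a_2 = 1.8$ and then recovers the zeros of the resulting second-order polynomial, one of which lies inside and one outside the unit circle.

```python
import math

def convolve(a, b):
    """Direct discrete convolution of two finite sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

h_min = [1.0, 0.5]  # H_min(z) = 1 + 0.5 z^-1
h_max = [1.0, 1.8]  # H_max(z) = 1 + 1.8 z^-1
h_mix = convolve(h_min, h_max)

# Zeros of H_mix(z) = 1 + c1 z^-1 + c2 z^-2 are the roots of z^2 + c1 z + c2.
_, c1, c2 = h_mix
disc = math.sqrt(c1 * c1 - 4.0 * c2)
zeros = sorted([(-c1 + disc) / 2.0, (-c1 - disc) / 2.0], key=abs)
```

The zero with magnitude below one belongs to the minimum-phase factor and the zero with magnitude above one to the maximum-phase factor, which is why only the former can be inverted directly.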
Then

$$\tilde{h}_{i\,\mathrm{mix}}(n) = h_{i\,\mathrm{min}}(n) * \tilde{h}_{i\,\mathrm{max}}(n)$$
$$= [1, -0.5, 0.25, -0.125, 0.0625] * [-0.0294, 0.0529, -0.0953, 0.1715, -0.3086, 0.556]$$
$$= [-0.0294, 0.0676, -0.1291, 0.2361, -0.4266, 0.7693, -0.3847, 0.1923, -0.0961, 0.0478, -0.0236, 0.0111, -0.0043].$$

This is a nonminimum-phase operator, as shown in Fig. 17(b). The result of the convolution of $h_{\mathrm{mix}}(n)$ with $\tilde{h}_{i\,\mathrm{mix}}(n)$ will be

$$h_{\mathrm{out\,mix}}(n) = h_{\mathrm{mix}}(n) * \tilde{h}_{i\,\mathrm{mix}}(n) = [-0.0294, 0.0000, -0.0001, 0.0000, 0.0005, 1.0008, 0.0000, 0.0001, 0.0000, 0.0000, \ldots]$$

as shown in Fig. 17(c). This is again an approximate delayed pulse, which is sufficiently close to the exact result in the sense that it has a nearly flat spectrum and linear phase (except for the constant term due to the delay).
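The appendix example can be reproduced approximately with the sketch below. It is a hedged illustration, not the author's implementation: the function names are hypothetical, the minimum-phase inverse is truncated to eight terms (the number of coefficients listed after Eq. (32)), and the maximum-phase inverse is truncated to the six taps listed in the example. Exact powers are used instead of the rounded coefficients, so the values agree with the printed ones only to about three decimal places.

```python
def convolve(a, b):
    """Direct discrete convolution of two finite sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def min_phase_inverse(a1, n_terms):
    """First n_terms of 1/(1 + a1 z^-1) = 1 - a1 z^-1 + a1^2 z^-2 - ... (|a1| < 1)."""
    return [(-a1) ** n for n in range(n_terms)]

def delayed_max_phase_inverse(a2, n_taps):
    """Truncated, delayed expansion of 1/(1 + a2 z^-1) in powers of z (|a2| > 1).

    The last tap is 1/a2; earlier taps are -1/a2^2, 1/a2^3, ... toward n = 0.
    """
    taps = [-((-1.0 / a2) ** k) for k in range(1, n_taps + 1)]
    return taps[::-1]

h_mix = convolve([1.0, 0.5], [1.0, 1.8])
h_i_min = min_phase_inverse(0.5, 8)
h_i_max = delayed_max_phase_inverse(1.8, 6)
h_i_mix = convolve(h_i_min, h_i_max)  # approximate mixed-phase inverse
h_out = convolve(h_mix, h_i_mix)      # equalized response
peak = max(range(len(h_out)), key=lambda n: abs(h_out[n]))
```

Under these assumptions the equalized response is close to a unit pulse delayed by six samples, with the largest remaining error at n = 0, where the truncation of the maximum-phase inverse leaves a term of about -0.0294.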
THE AUTHOR
John N. Mourjopoulos was born in Greece in 1954. In 1977 he received the B.Sc. degree in engineering from Lanchester Polytechnic in the United Kingdom and in 1979 the M.Sc. degree in acoustics from the Institute of Sound and Vibration Research (ISVR), University of Southampton. In 1984 he completed the Ph.D. degree at the same institute, working in the areas of digital signal processing and room acoustics. He also worked at ISVR as a research fellow. Since 1986 he has been with the Wire Communications Laboratory, Electrical Engineering Department, University of Patras, Greece, where he is currently an assistant professor in electroacoustics.
Dr. Mourjopoulos was coorganizer and contributor to the Digital Audio Signal Processing short course held from 1986 to 1988 at ISVR and also coorganizer of the Audio Technology seminar held from 1989 to 1993 in Patras, Greece. His research interests are in the areas of audio and room acoustics equalization, audio digital signal processing, analysis, modeling, and coding, as well as speech enhancement and recognition. He has also been active as a musician, participating in recordings and concerts and composing music for film and television. He is a member of the Audio Engineering Society, being currently the chairman of the newly formed Greek AES Section, and he has presented many papers at AES Conventions. He is also a member of the IEEE and the Hellenic Acoustical Society.