Examples of constraint-based specification of room acoustic parameters

Authors:
  • aQrate Acoustics Ltd.

Examples of constraint-based specification
of room acoustic parameters
Andor T. Fürjes¹
¹ Head of acoustic development, Animative Ltd., Budapest, Hungary
Abstract
The paper provides an overview of the concept of room acoustics specifications, where a set of mathematical,
physical or other constraints results in ranges and relationships of room acoustic parameters. In its simplest
forms, this approach is known and used widely. Using new types of constraints is useful to reveal priority and
relationships among room acoustic parameters and also to avoid infeasible sets of requirements. Approaches
ranging from geometrical and statistical considerations to simple modelling are presented and compared.
Keywords: room acoustics, specification, room size, natural frequency, diffusion.
1 Introduction
In engineering practice, quality is always measured by technical parameters, and limits are usually assigned
to those parameters to support decisions or simply to distinguish bad, hazardous or risky situations from
“preferred” ones. In room acoustics the most basic relation has been known since the studies by Sabine: in
order to shorten the decay heard after the sound stops, one must add more absorption to the room,
because $T \propto V/A$, where $V$ is the volume of the room, $A$ is the absorbing power and $T$ is the length of the decay
with respect to a $1:10^{6}$ change in power. While this relation might seem historical, it is used directly or
indirectly as the basis of the most basic considerations and standards.
Since I entered the field of engineering room acoustics, I have found different forms of rules to follow during design,
but rarely well-founded explanations for them. For example, when I was involved in creating the new
Hungarian standard of room acoustic specifications, I was asked to write guidelines that would help the reader
to steer the design towards better acoustics. Such guidelines are based on some kind of constraints. There I
chose simply to set constraints based on the timing of the first lateral reflections and ended up with a reasonable
guideline for classroom sizing.
This paper revisits some of the known constraint-based guidelines and explores new ways of finding reasonable
specifications.
2 Classroom Sizing
One of the most important items in the discussion of the aforementioned national room acoustic standard was
the case of classrooms. Besides the debate on reverberation time requirements, a set of other suggestions were
also included to support sizing of classrooms.
An important observation was that teachers do not stay in one position facing one direction: they walk and
turn around, so their directivity and position should not be taken into account. In any case it is important to
direct early energy to the students, if not directly, then by reflections. Other studies agree that the first
50 ms is generally acceptable as the early-late limit in the energy ratios that express the effect of room acoustics on
speech intelligibility and clarity [1]. In addition, a 35 ms early-late limit is suggested for cases where clarity
is even more important (e.g. foreign language study, children with hearing difficulties).
If we set the constraint that reflections shall arrive within the first 50 ms or 35 ms at any position, a guideline
for the dimensions of the classroom can be derived. Reverberation time control usually starts at the ceiling, and hearing
is more sensitive to lateral reflections, so reflections from the ceiling are not considered. Reflections from
the floor or desks are excluded as well, because they are highly position dependent and also depend on occupancy
and other mobile objects.
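The constraint above can be illustrated with a minimal sketch that counts lateral reflections using two-dimensional image sources of a rectangular floor plan. It is not the procedure used for the standard: the speed of sound (343 m/s), the interpretation of the time limit as a delay relative to the direct sound, and the function name `count_early_lateral_reflections` are assumptions made for this illustration. With these assumptions, 50 ms corresponds to a path length difference of about 17.15 m.

```python
import math

C = 343.0  # speed of sound in m/s (assumed value)


def count_early_lateral_reflections(length, width, src, rcv,
                                    t_limit=0.050, max_index=6):
    """Count wall reflections arriving within t_limit seconds after the
    direct sound. Only the four walls are mirrored (2D image sources);
    ceiling, floor and furniture are ignored, as in the text above."""
    xs, ys = src
    xr, yr = rcv
    direct = math.hypot(xr - xs, yr - ys)
    count = 0
    # max_index bounds the image lattice; 6 is ample for classroom sizes.
    for n in range(-max_index, max_index + 1):
        for qx in (0, 1):
            x_img = (1 - 2 * qx) * xs + 2 * n * length
            for m in range(-max_index, max_index + 1):
                for qy in (0, 1):
                    if n == 0 and m == 0 and qx == 0 and qy == 0:
                        continue  # this "image" is the source itself
                    y_img = (1 - 2 * qy) * ys + 2 * m * width
                    path = math.hypot(xr - x_img, yr - y_img)
                    if (path - direct) / C <= t_limit:
                        count += 1
    return count


if __name__ == "__main__":
    # Hypothetical 9.6 m x 7.0 m classroom, source near the front wall.
    n50 = count_early_lateral_reflections(9.6, 7.0, (0.5, 3.5), (8.6, 1.0))
    n35 = count_early_lateral_reflections(9.6, 7.0, (0.5, 3.5), (8.6, 1.0),
                                          t_limit=0.035)
    print(n50, n35)
```

To apply the constraint "from any position", one would evaluate the count over a grid of receiver positions and take the minimum.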
Figure 1 shows the result of the constraint to keep the number of early lateral reflections high (taking 8
reflections to be the “best”); this chart was included in the standard. Interestingly, the results agree well with some
other common conventions (maximum floor area, maximum length, usual number of desks/children etc.). An
important message of this figure is also that, to keep listeners engaged, they should be kept close, i.e. their number low.
Figure 1 Classroom sizing chart, based on minimum required number of early lateral reflections.
(left: source S is in the corner, right: source S is in front center, both: receivers R within 1 m of walls)
3 Concert Hall Sizing
Setting criteria for concert hall dimensions is supported by previous works. A ratio given to support shoebox-shaped
concert hall design is for example:
$H/W > 0.7$ and $L/W < 2$ (1)
where $H$ is the height, $W$ is the width and $L$ is the length of the hall.
To see this working, let us assume a set of constraints (e.g. [2]):
- the width of the room shall be within:  
- number of listener seats: 
- volume of hall: 

- maximum distance to stage (we assume it to be the center of the stage):  
and some usual architectural constraints:
- required floor area of audience (assuming 0.6×0.9 m per seat):

- stage area:  and .
- width of corridor around audience area: 2.0 m
- maximum number of seats in a row:  
From basic geometric calculations the number of seats, and then the volume and height, can be calculated. Assuming
mean absorption coefficients (NRC) of 0.10 for the walls and 0.50 for the unoccupied audience area, statistical reverberation times
(Eyring) can be calculated and other room acoustic parameters (G, LF80, C80) can be approximated. Figure 2
shows the generalized floor plan scheme and the results for reverberation time, G and LF80, the latter two using
approximations found in [3] and [4], respectively. No characteristic preference can be read from
those graphs.
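The statistical part of this calculation can be sketched as follows. This is only an illustration of the Eyring formula with the absorption values quoted above; the function name `eyring_rt`, the speed of sound (343 m/s) and the simplification that the whole floor is audience area are assumptions of this sketch, whereas the floor plan scheme in the paper reserves area for the stage and corridors.

```python
import math


def eyring_rt(length, width, height, audience_area,
              alpha_walls=0.10, alpha_audience=0.50, c=343.0):
    """Eyring reverberation time (s) of a shoebox room.

    audience_area is the part of the boundary covered by seats; every
    other surface gets alpha_walls (a simplifying assumption)."""
    volume = length * width * height
    surface = 2.0 * (length * width + length * height + width * height)
    other_area = surface - audience_area
    alpha_mean = (alpha_audience * audience_area
                  + alpha_walls * other_area) / surface
    # 24 ln(10) / c is the familiar constant of about 0.161 s/m
    return (24.0 * math.log(10.0) / c) * volume / (
        -surface * math.log(1.0 - alpha_mean))


if __name__ == "__main__":
    # 36 m x 19 m floor with H/W = 0.7, whole floor treated as audience
    width = 19.0
    print(round(eyring_rt(36.0, width, 0.7 * width, 36.0 * width), 2))
```

Sweeping `length` and `width` with a fixed height-to-width ratio produces a chart of the same kind as the reverberation time panel of Figure 2.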
Figure 2 Concert Hall sizing chart, based on constraints including floor plan scheme
and statistical room acoustic approximations. Shadowed area denotes a 36×19 m floor area as an example.
If we set the volume to a constant V = 16000 m³ as a constraint, omit the maximum number of seats per row and draw the
reverberation time as a function of length and height, the purpose of (1) is revealed (see Figure 3): it
keeps the reverberation time at a maximum. In other words, (1) actually aims at the right height and not at the
length/width ratio.
[Figure 2 panels: floor plan scheme (labels: Nrow (max.), L/3, L, W, Wc); Strength (G, dB); Lateral Fraction (LF80); Reverberation time (s, T60,Eyring). Axes: Length (m) vs. Width (m), H/W = 0.7.]
Figure 3 Concert Hall sizing chart for reverberation time, using constraints: V = 16000 m³, Nrow not
limited, NRC of walls is 0.15, NRC of seats is 0.50 (unoccupied) or 0.80 (occupied), 0.50 m²/seat.
4 Room Sizing for Low Frequency Transmission in a Rectangular Room
Low frequency behaviour of a room is assumed to be well controlled if room modes are distributed evenly in
the musically important low frequency region of 20-200 Hz. Suggestions for preferred ratios of length, width
and height are therefore based on evaluating the distribution of natural frequencies.
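Natural frequencies of rectangular spaces can be listed using the equation
$f_{n_x,n_y,n_z} = \frac{c}{2}\sqrt{\left(\frac{n_x}{L}\right)^{2} + \left(\frac{n_y}{W}\right)^{2} + \left(\frac{n_z}{H}\right)^{2}}$ (2)
where $c$ is the speed of sound and $n_x$, $n_y$, $n_z$ are any combination of natural numbers (including zero) with $n_x + n_y + n_z > 0$.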
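As a minimal sketch, the natural frequencies of a rectangular room can be enumerated directly from the standard rigid-wall expression $f = \frac{c}{2}\sqrt{(n_x/L)^2 + (n_y/W)^2 + (n_z/H)^2}$. The function name `room_modes`, the speed of sound (343 m/s) and the 200 Hz upper limit are assumptions chosen for this illustration.

```python
import math


def room_modes(length, width, height, f_max=200.0, c=343.0):
    """Return sorted (frequency, (nx, ny, nz)) tuples up to f_max."""
    # Highest useful index per axis follows from c * n / (2 * dim) <= f_max
    nx_max = int(2.0 * length * f_max / c)
    ny_max = int(2.0 * width * f_max / c)
    nz_max = int(2.0 * height * f_max / c)
    modes = []
    for nx in range(nx_max + 1):
        for ny in range(ny_max + 1):
            for nz in range(nz_max + 1):
                if nx == 0 and ny == 0 and nz == 0:
                    continue  # the all-zero combination is not a mode
                f = 0.5 * c * math.sqrt((nx / length) ** 2
                                        + (ny / width) ** 2
                                        + (nz / height) ** 2)
                if f <= f_max:
                    modes.append((f, (nx, ny, nz)))
    modes.sort()
    return modes


if __name__ == "__main__":
    for f, index in room_modes(5.0, 4.0, 3.0)[:5]:
        print(f"{f:6.1f} Hz  {index}")
```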
If the quality of the room is expressed as the mean square of the distance between adjacent natural frequencies (based
on [5]), one obtains a contour plot in which well-known preferred and risky ratios also stand out. Figure 4 also marks
the preferred area of ratios suggested by [5]:
$1.1\,\frac{W}{H} \le \frac{L}{H} \le 4.5\,\frac{W}{H} - 4$ (3a)
$L < 3H$, $W < 3H$. (3b)
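A minimal sketch of such a quality measure is given below. It computes the mean square spacing of adjacent natural frequencies and checks a room against the conventional criterion of [5], taken here in the commonly quoted form $1.1\,W/H \le L/H \le 4.5\,W/H - 4$ with $L < 3H$ and $W < 3H$. That form, the 120 Hz limit, the speed of sound and all function names are assumptions of this illustration, not the author's program.

```python
import math
from itertools import product


def mode_frequencies(L, W, H, f_max=120.0, c=343.0):
    """Sorted natural frequencies (Hz) of a rigid rectangular room."""
    n_max = [int(2.0 * d * f_max / c) for d in (L, W, H)]
    freqs = []
    for nx, ny, nz in product(*(range(n + 1) for n in n_max)):
        if nx + ny + nz == 0:
            continue
        f = 0.5 * c * math.sqrt((nx / L) ** 2 + (ny / W) ** 2 + (nz / H) ** 2)
        if f <= f_max:
            freqs.append(f)
    return sorted(freqs)


def mean_square_spacing(freqs):
    """Mean of squared gaps between adjacent frequencies (lower is better)."""
    gaps = [b - a for a, b in zip(freqs, freqs[1:])]
    return sum(g * g for g in gaps) / len(gaps)


def satisfies_conventional_criterion(L, W, H):
    """Assumed form of the criterion from [5], see the lead-in text."""
    l, w = L / H, W / H
    return 1.1 * w <= l <= 4.5 * w - 4.0 and l < 3.0 and w < 3.0


if __name__ == "__main__":
    # Hypothetical 200 m3 room with W/H = 1.4 and L/H = 1.9
    H = (200.0 / (1.4 * 1.9)) ** (1.0 / 3.0)
    L, W = 1.9 * H, 1.4 * H
    q = mean_square_spacing(mode_frequencies(L, W, H))
    print(math.log10(q), satisfies_conventional_criterion(L, W, H))
```

Evaluating `mean_square_spacing` on a grid of L/H and W/H values at constant volume gives a contour map of the kind discussed here.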
[Figure 3 panels: reverberation times vs. sizes (unoccupied status) and reverberation times vs. sizes (occupied status). Axes: L (length, m) vs. H (height, m). Marked constraints: stage size > 50 m², width < 22 m, height/width > 0.7, length/width < 2.]
Figure 4 Contour plot of quality of distribution of natural frequencies. White dots denote golden ratios,
the white line outlines the area suggested by inequalities (3a) and (3b).
Although they help to avoid some mistakes, Figure 4 shows that (3a) and (3b) do not guarantee a flawless
distribution of modes.
The topic is well summarized in [6], which also introduces another type of quality descriptor called the frequency spacing
index (FSI), the normalized relative variance of the distance between adjacent low frequency modes.
An interesting conclusion was that the quality depends more on W/L than on W/H.
The question is therefore: do these conclusions change if, instead of the distribution of eigenmodes, actual
responses within a limited region of the room are qualified and compared? The change of view is practical,
because both instruments (sources) and listeners are using only the  ,  and
  region, not the whole volume.
Responses in a rectangular enclosure can be calculated using the mirror image source method. A systematic
series of calculations was run with the following settings to see whether any conclusions can be drawn:
- geometry is definite (no uncertainties), reflections are purely specular,
- absorption is 0.10 on every surface,
- volume is 200 m³,
- number of possible sources is 10, number of receivers is 25 (250 responses overall for each dimensional
variation, repeated once to have a total of 500 different random source-receiver positions for each L, W and
H),
- the source is an omnidirectional source producing 0 dB SPL at 1 m,
- mirror sources up to at least 20 times the diagonal of the room were collected.
[Figure 4 panel: log mean square of eigentone spacing for a 200 m³ room up to 120 Hz. Axes: L/H and W/H (1.0 to 3.0). Markers: golden ratio, areas suggested by conventional criteria.]
Responses were calculated upon the complex summation of coherent image sources at frequencies stepped by
1/12th octave resolution from 20 Hz to 200 Hz. Both were calculated, but interestingly
did hardly differentiate any preferences on ratios, but suggests only to increase height as much as
possible and to keep W/L between 0.6 and 1.3.
The reason to choose the image source method was to investigate other tendencies (e.g. rise time, angle-
dependent reradiation, geometrical uncertainties, asymmetric absorption etc.) in later runs.
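The calculation described above can be sketched in a few lines. The following is a simplified illustration, not the author's implementation: image sources are truncated by lattice index (`max_index`) rather than by 20 times the room diagonal, each reflection is weighted by a real pressure factor $\sqrt{1-\alpha}$, and the function names, the speed of sound and the example positions are assumptions.

```python
import cmath
import math
from itertools import product


def axis_images(dim, s, max_index):
    """Image coordinates along one axis with their reflection counts."""
    images = []
    for n in range(-max_index, max_index + 1):
        for q in (0, 1):
            images.append(((1 - 2 * q) * s + 2 * n * dim, abs(n - q) + abs(n)))
    return images


def twelfth_octave(f_lo=20.0, f_hi=200.0):
    """Frequencies from f_lo upwards in 1/12-octave steps, not above f_hi."""
    steps = int(math.floor(12.0 * math.log2(f_hi / f_lo)))
    return [f_lo * 2.0 ** (i / 12.0) for i in range(steps + 1)]


def response_db(room, src, rcv, freqs, alpha=0.10, max_index=3, c=343.0):
    """Level (dB re. the free-field level at 1 m) from the coherent,
    complex sum of all image sources of a shoebox room."""
    beta = math.sqrt(1.0 - alpha)  # pressure reflection factor (assumed real)
    axes = [axis_images(d, s, max_index) for d, s in zip(room, src)]
    images = []
    for (x, rx), (y, ry), (z, rz) in product(*axes):
        r = math.dist((x, y, z), rcv)
        images.append((r, beta ** (rx + ry + rz)))
    levels = []
    for f in freqs:
        k = 2.0 * math.pi * f / c
        p = sum(g / r * cmath.exp(-1j * k * r) for r, g in images)
        levels.append(20.0 * math.log10(abs(p)))
    return levels


if __name__ == "__main__":
    room = (8.0, 6.25, 4.0)  # hypothetical 200 m3 room
    levels = response_db(room, (1.0, 1.5, 1.2), (5.0, 4.0, 1.2),
                         twelfth_octave())
    print([round(v, 1) for v in levels[:6]])
```

Truncating by index keeps the sketch short; a distance-based cut-off as described in the text would replace `max_index` by a test on `r`.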
Quality of each response can be expressed based on mean absolute deviation from the mean response (Q1),
the standard deviation of the response (Q2) or the difference of maximum and minimum response (Q3). Lower
values denote preferred situations for all quality indicators. A single response and a set of responses for a single
source position are shown in Figure 5.
Figure 5 Calculated responses of a shoebox MISM model, top: single source/receiver and descriptors of
response ‘quality’, bottom: responses from 1 source to 25 receivers, monochromatic at 1/12th octave
frequencies between 20 Hz and 200 Hz, assuming .
Using these assumptions, the results for quality descriptors Q1, Q2 and Q3 are shown in Figure 6. The
characteristic contours of Q1, Q2 and Q3 appear similar, but Q3 shows the differences more clearly. A combined
qualifier Q* is the average of the normalized qualifiers Q1*, Q2* and Q3*, where normalization means
scaling the results to the range between 0 and 1, so that, for example, Q3* also takes values between 0 and 1.
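The descriptors and their combination can be written down compactly. In this sketch the standard deviation is taken as the population standard deviation and normalization as min-max scaling over all evaluated room proportions; both choices, as well as the function names, are assumptions, since the text does not fix them.

```python
import statistics


def quality_descriptors(levels_db):
    """Return (Q1, Q2, Q3) for one response given as levels in dB."""
    mean = statistics.fmean(levels_db)
    q1 = statistics.fmean([abs(v - mean) for v in levels_db])  # mean abs. dev.
    q2 = statistics.pstdev(levels_db)                          # std. deviation
    q3 = max(levels_db) - min(levels_db)                       # max minus min
    return q1, q2, q3


def normalize(values):
    """Min-max scaling to the range 0..1 (values must not all be equal)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def combined_qualifier(rows):
    """rows holds one (Q1, Q2, Q3) tuple per room proportion; returns Q*."""
    columns = [normalize(col) for col in zip(*rows)]
    return [sum(triple) / 3.0 for triple in zip(*columns)]


if __name__ == "__main__":
    rows = [quality_descriptors(r) for r in ([0.0, 2.0, 4.0],
                                             [1.0, 2.0, 3.0],
                                             [-6.0, 0.0, 9.0])]
    print(combined_qualifier(rows))
```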
Figure 6 Contours of low frequency response qualifiers Q1, Q2 and Q3 for different proportions of a V =
200 m³ reverberant (α = 0.1, ) room. Color axis is in dB (except for right bottom). Results from
1125 different ratios and 500 random source receiver responses each. Bottom right: average of normalized
qualifiers, color axis limited to 0.5 to reveal a better view on minimums.
Figure 7 Enhanced view from Figure 6, showing some preferred ratios from past works (see [6]).
White dashed outlined areas denote preferred ratios for this qualifier.
[Figure 7 annotations: golden ratio; Walker: 1, Walker: 2; Rindel: A, Rindel: B, Rindel: C; Volkman: 1; outlined regions: 0.1 < L/H - W/H < 0.2, 1.0 < L/H - W/H < 1.1, 2.8 < L/H + W/H < 2.9; H < 4 m.]
Because of the discrepancies with other works, the effect of changing the constraints needs to be tested before general
conclusions can be drawn on preferred room size ratios for given possible source and receiver positions.
5 Minimum Required Diffusion
It would be highly important, but to the best of the author’s knowledge there is no objective measure available
that could describe the overall diffusivity of a room and be used similarly to absorbing power.
In theory, however, it is quite simple to derive a requirement.
If we accept that the energy decay can be approximated by the function (Eyring)
$E(t) = E_0\,(1-\bar{\alpha})^{ct/\bar{l}}$ (4)
where $t$ is time, $c$ is the speed of sound, $\bar{\alpha}$ is the average absorption coefficient and $\bar{l} = 4V/S$ is the mean free
path between reflections, then the purely specular part is
$E_{spec}(t) = E_0\,[(1-\bar{\alpha})(1-\bar{s})]^{ct/\bar{l}}$ (5)
where $\bar{s}$ is the average scattering coefficient of the room.
We may assume that once a part of the specular incident energy is scattered, it will stay scattered, and that the
total energy is the sum of the purely specular and the non-specular (or scattered) energies at any moment:
$E(t) = E_{spec}(t) + E_{diff}(t)$ (6)
The diffuse and specular parts are equal when
$(1-\bar{s})^{ct/\bar{l}} = \frac{1}{2}$ (7)
which yields
$t_{tr} = \frac{\bar{l}}{c}\,\frac{\ln(1/2)}{\ln(1-\bar{s})}$ (8)
where $t_{tr}$ is the time at which purely specular and non-specular energies are in balance.
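This expression is very similar to the Eyring formula:
$T = \frac{\bar{l}}{c}\,\frac{\ln(10^{-6})}{\ln(1-\bar{\alpha})}$ (9)
so (8) and (9) can be combined to relate specular-to-diffuse balance to reverberation time:
$\frac{t_{tr}}{T} = \frac{\ln(1-\bar{\alpha})}{\ln(10^{-6})}\cdot\frac{\ln(1/2)}{\ln(1-\bar{s})} = \frac{\ln 2}{\ln 10^{6}}\cdot\frac{\ln(1-\bar{\alpha})}{\ln(1-\bar{s})}$ (10)
or
$t_{tr} \approx 0.050\,T\,\frac{\ln(1-\bar{\alpha})}{\ln(1-\bar{s})}$ (11)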
Using (11) one may set a constraint for measuring a stable reverberation time, meaning that from the -5 dB point
on (approx. 32% energy left in the decay) the non-specular part of the energy decay curve shall be dominant. This
constraint means that $t_{tr} \le \frac{5}{60}\,T$, or from (11):
$0.050\,\frac{\ln(1-\bar{\alpha})}{\ln(1-\bar{s})} \le \frac{1}{12} \;\Rightarrow\; \bar{s} \ge 1-(1-\bar{\alpha})^{0.602}$ (12a)
Similarly, if the requirement is to have more scattered energy from the -10 dB point on (approx. 10% energy
left in the decay), the necessary average scattering coefficient is:
$\bar{s} \ge 1-(1-\bar{\alpha})^{0.301}$ (12b)
These results (see Figure 8) suggest that if the diffuse ratio is at least 0.10, then formulas based on diffuse
field theory are surely valid up to a mean absorption coefficient of 0.15, and most probably valid up to a
mean absorption coefficient of 0.30.
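The requirement can be evaluated numerically with a short sketch. It assumes the decay model described above: total energy decays as $(1-\bar{\alpha})^{n}$ and specular energy as $[(1-\bar{\alpha})(1-\bar{s})]^{n}$ after $n$ mean free paths, so the scattered part dominates once $(1-\bar{s})^{n} \le 1/2$. The function name `min_scattering` is introduced only for this illustration.

```python
import math


def min_scattering(alpha_mean, level_db):
    """Minimum average scattering coefficient for which scattered energy
    dominates from the point where the decay has dropped by level_db dB.

    At that point (1 - alpha)^n = 10^(-level_db / 10); requiring
    (1 - s)^n <= 1/2 gives s >= 1 - (1 - alpha)^(log10(2) / (level_db / 10)).
    """
    exponent = math.log10(2.0) / (level_db / 10.0)
    return 1.0 - (1.0 - alpha_mean) ** exponent


if __name__ == "__main__":
    for alpha in (0.05, 0.10, 0.15, 0.20, 0.30, 0.50):
        print(f"alpha = {alpha:4.2f}:  "
              f"s(-5 dB) >= {min_scattering(alpha, 5.0):.3f}   "
              f"s(-10 dB) >= {min_scattering(alpha, 10.0):.3f}")
```

The exponents evaluate to about 0.602 for the -5 dB point and 0.301 for the -10 dB point.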
Figure 8 Required minimum average scattering coefficient to ensure more scattered energy than specular
energy from the -5 dB or -10 dB point of the decay, as a function of the average absorption coefficient.
Another constraint might be the minimum required reflection density. In general cases the density $dN/dt$ of the number
of reflections is known to be
$\frac{dN}{dt} = \frac{4\pi c^{3} t^{2}}{V}$ (13)
This formula is sometimes used to derive the mixing time $t_{mix}$, which marks the point where the diffusely
dense tail of the impulse response begins.
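Using the transition time from (8) we can calculate the reflection density at the moment from which diffuse
reverberation starts to dominate:
$\left.\frac{dN}{dt}\right|_{t_{tr}} = \frac{4\pi c^{3}}{V}\left[\frac{\bar{l}}{c}\,\frac{\ln(1/2)}{\ln(1-\bar{s})}\right]^{2}$ (14a)
$\left.\frac{dN}{dt}\right|_{t_{tr}} = \frac{4\pi c\,\bar{l}^{2}}{V}\,\frac{(\ln 2)^{2}}{[\ln(1-\bar{s})]^{2}}$ (14b)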
In an extensive review and perceptual test of prediction methods for $t_{mix}$ (see [7]) the conclusion was
that the perceptual mixing time can be predicted from the mean path length, while reverberation time and average
absorption have no significant influence. Formally this agrees with the formula in (14b) if one assumes that
scattering is the main source of diffusely dense reflection patterns.
Please note, however, that (13) does not come from diffuse field theory, so $t_{mix}$ derived from (13) is rather
about the specular reflections and not the scattered field. This may also explain why the reflection density from
(14b) is in most cases much larger than the empirically preferred minimum of >500 reflections per second (see
Figure 9), because the scattered decay develops much faster than the specular reflection density. Figure 9 suggests that
specular reflection density will determine the mixing time only if scattering is low. For example, in a
40×20×20 m hall ( , ), the 
  can be achieved by specular reflections alone if
. In a 9×6×3 m room ( , ), the 
  is fulfilled alone by specular
reflections if . These results raise the question whether mixing times measured or derived only from reflection
densities express a real transition to the diffuse part or just the start of the subjectively dense reverberation
tail of the response.
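The relation between scattering and specular reflection density can be explored with a small sketch. It assumes the standard reflection density $dN/dt = 4\pi c^{3}t^{2}/V$, the mean free path $\bar{l} = 4V/S$, a transition time at which $(1-\bar{s})^{ct/\bar{l}} = 1/2$, and a speed of sound of 343 m/s; the function names are introduced for this illustration only.

```python
import math

C = 343.0  # speed of sound in m/s (assumed value)


def shoebox(length, width, height):
    """Volume and total surface area of a rectangular room."""
    volume = length * width * height
    surface = 2.0 * (length * width + length * height + width * height)
    return volume, surface


def transition_time(volume, surface, s_mean, c=C):
    """Time at which scattered and specular energies are in balance."""
    mean_free_path = 4.0 * volume / surface
    return (mean_free_path / c) * math.log(0.5) / math.log(1.0 - s_mean)


def reflection_density(t, volume, c=C):
    """Specular reflection density (reflections per second) at time t."""
    return 4.0 * math.pi * c ** 3 * t ** 2 / volume


def max_scattering_for_density(volume, surface, density=500.0, c=C):
    """Largest average scattering coefficient for which the specular
    reflection density at the transition time still reaches `density`."""
    t_needed = math.sqrt(density * volume / (4.0 * math.pi * c ** 3))
    mean_free_path = 4.0 * volume / surface
    return 1.0 - math.exp(-(mean_free_path / c) * math.log(2.0) / t_needed)


if __name__ == "__main__":
    for dims in ((40.0, 20.0, 20.0), (9.0, 6.0, 3.0)):
        v, s = shoebox(*dims)
        print(dims, round(max_scattering_for_density(v, s), 3))
```

Below the returned scattering coefficient the transition to scattered energy happens late enough for the specular density to reach the chosen threshold; above it, scattering takes over first.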
Figure 9 Specular reflection density at the time when diffuse decay starts to dominate over specular decay.
(Color axis is shown in base-10 logarithmic scale for readability.)
6 Conclusions
Room acoustic specification is partially based on experience, but to make knowledge-based decisions, it is
important to rely on simple guidelines too, that are consequences of physical, geometrical or other types of
constraints.
The paper gave an overview of some of the aspects that are concluded from such constraints, mainly aiming
to support sizing of rooms for clarity (classroom) or higher reverberation times (concert halls) or for a smoother
low frequency response.
A simple assumption could also explain why more scattering is required along with higher absorption
if statistical formulas are expected to be valid. Since there are no general rules for assigning scattering to
ordinary surfaces, currently this rule may be used only as an indicator in numerical simulations.
Hopefully similarly practical guidelines can be constructed from simple constraints in order to make room
acoustic design more of an engineering art.
References
[1] J. A. T. Whitlock, G. Dodd, “Speech intelligibility in classrooms: specific acoustical needs for primary school children”, https://www.marshallday.com/media/3064/speech-intelligibility-in-classrooms-whitlock-dodd.pdf, last read 2021.08.16.
[2] L. L. Beranek, “Concert Hall Design: New Findings”, Proceedings of the Institute of Acoustics, Vol. 36, Pt. 3, 2014.
[3] A. C. Gade et al., “Handbook of Acoustics”, 2nd Edition, 2014, Chapter 9, ISBN: 978-1-4939-0754-0
[4] A. K. Klosak, A. C. Gade, “Relationship between room shape and Early Lateral Energy Fraction in rectangular concert halls”, Proceedings of NAG/DAGA 2009, pp. 536-539
[5] R. Walker, “Optimum dimension ratios for studios, control rooms and listening rooms”, BBC Research Department Report, BBC RD 1993/8
[6] J. H. Rindel, “Preferred dimension ratios of small rectangular rooms”, JASA Express Letters 1 (2), 021601 (2021)
[7] A. Lindau, L. Kosanke, S. Weinzierl, “Perceptual evaluation of physical predictors of the mixing time in binaural room impulse responses”, 128th AES Convention, May 2010