Computer Music Journal, February 10, 2008
Classic Stereo Imaging Transforms—A Review
Joseph Anderson
Scarborough ElectroAcoustic/Hull ElectroAcoustic Research Centre
School of Arts and New Media
University of Hull
Scarborough, YO11 3AZ, UK
j.anderson@hull.ac.uk
Introduction
The present survey is a result of the author’s two interrelated pursuits in the art
of recorded sound: firstly as a composer of ‘acousmatic’ music (Dhomont 1996;
Windsor 2000), and additionally as a ‘purist’ sound recordist (Lipshitz 1986). The
collection of sound recordings for use as the building blocks of a new sound world is
a very important activity within the British tradition of acousmatic music.
Improvising with sound-making around a pair of microphones in the studio is the
usual first step in composing for this form. In practice, the choice of microphone
techniques (pattern and array types) actually used may vary, but the result will be a
stereophonic recording. While it may be simpler in production terms to start with
single track monophonic recordings, the skilled acousmatic composer begins with
stereo so that the fabric of the piece to be created will be embodied with the spatial
attributes1 (Rumsey 2002) already present in the initial improvised source material
recordings. Similarly, the purist sound recordist is concerned with making stereo
recordings that successfully capture the spatial attributes of acoustic events,
usually, but not limited to, musical performances.

1 These include width, depth, distance, envelopment, spaciousness, spatial
impression, etc.
When it comes to managing the resulting recordings, both the acousmatic and
the purist have a similar problem. The composer needs a handle or a set of tools to
address the spatial attributes of the collected sounds if this is to be one of the
parameters to be manipulated to create meaning—and for the artist to be regarded
as skilled, this is expected. For the purist recordist, the matter may be more
remedial. Particularly for recordings made at concert events, where staging and
sightline concerns may not always result in well balanced, centred or otherwise
spatially well represented recordings, reparative action may well be called for. The
purist recordist may need to virtually re-aim, re-balance or otherwise re-jig the
result, essentially ‘spatially remixing’, to produce a more appropriate and usable
recording. In mentioning the acousmatic and the purist, the intention is not to limit
the audience for the overview to follow. Ideally one might suppose all audio
professionals should be familiar with the techniques to be discussed, and in
particular, the mastering and mix engineer.
The intention of this discussion is to review a variety of classic stereo imaging
transforms, and while these are not necessarily able to address all the features of a
stereo recording that may be termed spatial attributes, a range of qualities can be
ergonomically managed. While many of the transforms to be surveyed are known
and well used by audio professionals, a number of these are not, and have often
been regarded as obscure and mysterious. The transforms themselves are presented
in the form of variations on the sine-cosine panning law, which is also referred to in
the literature as the tangent panning law or stereophonic law of sines (Malham
1998; Griesinger 2002). Inspired by Julstrom's (1991) 'stereo polar diagrams', which he uses
to illustrate coincident stereo microphone array sensitivity and the arrangement of
the resulting stereo stage, changes applied by the transforms to a stereo field are
illustrated through the use of figures similar to the display of a goniometer.2 The
intention is that of ‘a thousand words’, hopefully leading the reader to an intuitive
understanding of the action of the imaging transforms. Much of this work is the
result of reading and reflection on the work of Michael Gerzon, and the intention is
to make an attempt at illustrating and unifying a number of his discussions on the
topic. Finally, in closing, the notion of frequency dependent image transforms is
touched upon, with further reading of Gerzon suggested.
Sine-Cosine Panning Law and the MS Domain
Panorama Law
Many, if not most, audio practitioners begin their experience with stereo imaging
through the use of a panning law, positioning a monophonic sound onto a stereo
stage. In describing ‘stereo’, this discussion refers to a two channel stereo system
with two loudspeakers placed in front of a listener in an arc, usually subtended to
sixty degrees. Furthermore, many of the imaging impressions to be described
throughout are only apparent via loudspeaker playback and are in many cases not
perceived when a stereo signal is auditioned over headphones.3
2 Often referred to as a stereo phase scope.

3 The problem of headphone listening for spatial qualities has particular issues
divergent from loudspeaker listening, and is best considered within the context of
binaural techniques.

All stereo mixing desks and digital audio workstations incorporate a panorama or
panning law of some sort to position monophonic signals within a stereo stage; this is
achieved by distributing the input signal between the two output loudspeaker
channels. Perhaps the most commonly used panning law for stereo hardware desks
is the sine-cosine panning law, which applies gains to the input signal as
L_pan = cos(45° - θ_p) M
R_pan = sin(45° - θ_p) M
(1)
where M is an input monophonic signal and θ_p is the desired panning angle. θ_p
varies between +45° for panning fully to the left loudspeaker and -45° for panning to
the right. Assigning θ_p to 0° positions the input signal M in the centre of the
resulting image. In his comments regarding the convenience of using the sine-cosine
law, Griesinger (2002) has noted, “The sine-cosine law has the advantage of
maintaining constant energy as the apparent position is varied, and has a long
history in use.” This constant energy with varying position allows the perception of
constant loudness with many types of input signals.
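As a concrete sketch, the law of Equation 1 can be written in a few lines of Python; the function name `pan` and its signature are illustrative only, not drawn from any particular tool:

```python
import math

def pan(mono, theta_p_deg):
    """Sine-cosine panning law (Equation 1): distribute a mono sample
    across the stereo pair.  +45 deg is full left, -45 deg full right."""
    a = math.radians(45.0 - theta_p_deg)
    return (math.cos(a) * mono, math.sin(a) * mono)

# Centre pan: both channels receive a gain of sqrt(2)/2, so the total
# energy L^2 + R^2 remains 1 -- the constant-energy property noted above.
left, right = pan(1.0, 0.0)
```

For any panning angle the sum of the squared channel gains is 1, which is the constant-energy behaviour Griesinger describes.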
Wearing the hat of the purist recordist, it is worth mentioning the sine-cosine
panning law may also be implemented through the use of a stereo microphone
technique known as the Blumlein (1931; Gerzon 1976a; Lipshitz 1986) or Stereosonic
(Clark, Dutton, and Vanderlyn 1958) array. Two crossed bi-directional microphones
are used, the left microphone aimed 45° to the left of centre with the right
microphone at 45° to the right of centre. This technique is well regarded for its
ability to create vivid, stable, lively, and ‘objective’ recordings, placing the listener in
the space of the recording.
The MS Domain
Many in the audio community associate the MS domain (middle-and-side or
mono-and-stereo) with the MS stereo microphone technique (Dooley and Streicher
1982). As with other coincident techniques, the MS microphone technique has much
to recommend it—ease of deployment, vivid and accurate stereo imaging.
Introduced in Blumlein’s 1931 patent, the MS domain should not be regarded as tied
to microphones and sound acquisition4; the MS domain is merely an alternative way
to represent or view a stereo signal. A ‘standard’ stereo signal, consisting of left and
right signals, is a stereo signal in the LR domain (left-right). The LR and MS domains
are best thought of as two sides of the same coin; change to the stereo signal in one
domain is reflected in the other.
Transforming from the LR domain to the MS domain is achieved as follows
M = (√2/2) (L + R)
S = (√2/2) (L - R)
(2)
Similarly, the transform from MS to LR is
L = (√2/2) (M + S)
R = (√2/2) (M - S)
(3)
These two transforms are orthogonal; no information is lost moving from one
domain to the other. Transforming an LR stereo signal into the MS domain via
Equation 2 and then back into the LR domain via Equation 3 will result in returning
the original stereo signal, with no change.
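The orthogonality of the two transforms is easy to verify directly; a minimal Python sketch (function names are illustrative only):

```python
import math

K = math.sqrt(2.0) / 2.0  # the sqrt(2)/2 scalar of Equations 2 and 3

def lr_to_ms(l, r):
    """Equation 2: orthonormal LR -> MS transform."""
    return (K * (l + r), K * (l - r))

def ms_to_lr(m, s):
    """Equation 3: the inverse MS -> LR transform."""
    return (K * (m + s), K * (m - s))

# Round trip: LR -> MS -> LR returns the original stereo signal.
m, s = lr_to_ms(0.3, -0.7)
l, r = ms_to_lr(m, s)
```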
Panning a mono input signal via the sine-cosine panning law is realized in the
MS domain as

M_pan = cos(θ_p) M
S_pan = sin(θ_p) M
(4)

where M and θ_p are as before. The resulting panning is as above in Equation 1
when the MS stereo signal is transformed into the LR domain via Equation 3.
Notice that panning a mono signal in the MS domain appears to be slightly
simpler than in the LR domain, but with the added expense of transforming the
result to the LR domain. We'll see this becomes the case with a number of the
transforms to be reviewed; some are more convenient to implement in the LR
domain, others in the MS domain.

4 FM stereo radio transmissions are broadcast in the MS stereo domain.
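The equivalence of panning in the MS domain followed by conversion to LR with the direct law of Equation 1 can be checked numerically; as before, this is an illustrative Python sketch:

```python
import math

def pan_ms(mono, theta_p_deg):
    """Equation 4: sine-cosine panning realized in the MS domain."""
    a = math.radians(theta_p_deg)
    return (math.cos(a) * mono, math.sin(a) * mono)

def ms_to_lr(m, s):
    """Equation 3: transform the MS result back to the LR domain."""
    k = math.sqrt(2.0) / 2.0
    return (k * (m + s), k * (m - s))

# Pan at +30 deg in MS, then convert.  The LR result matches Equation 1:
# L = cos(45 - 30 deg) and R = sin(45 - 30 deg).
m, s = pan_ms(1.0, 30.0)
l, r = ms_to_lr(m, s)
```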
Visualizing the relationship between the LR and MS domains is an important
place to begin our survey. Those familiar with reading goniometers will likely
recognize the plot shown in Figure 1. The LR axis is plotted against the MS axis,
with angles marked at every 15° and gain marked at 6dB increments. The plot shows
three mono signals, each with a gain of 0dB, panned at +30°, 0° and -45° into a
resulting stereo image. This plot is not exactly what a goniometer would return for
the signals described. Firstly, a goniometer is not able to represent three signals
occurring simultaneously and give separate and accurate azimuth angles for each, as
is shown here. Instead, various Lissajous figures would be displayed for multiple
signals, and these vary depending on each of the signals and their panning
positions. Secondly, even for a stereo image consisting of a single panned mono
signal, a goniometer has further limitations. A goniometer isn’t able to represent
polarity; a signal panned to +30° will appear the same as one panned to -150°.
However, as can be seen from Equation 1, these two signals will have opposite
polarity; ideally a representation of a stereo image will take account of this.
The illustrations used throughout this discussion adopt a method of displaying
encoded panning angle and gain across the complete 360°: azimuth angle as
displacement around the circle, and gain as the radius from the centre. The stereo
image shown in Figure 1 is easily constructed through the use of hardware or
software stereo mixers, as all three mono signals are panned between +45° and -45°
(+L/R), left and right. Figure 2 illustrates a stereo image with 0dB mono signals
panned from 0° to -15°, at 15° increments. Constructing this sort of image requires
direct access to the sine-cosine panning law as elements are panned beyond the +45°
and -45° usually given as limits in mixers intended for a general user audience. (See
further discussion on this topic below.) The signal shown in Figure 2 will be used as
an identity signal and transforms to be reviewed in the remainder of this text will be
illustrated by acting upon this stereo signal.
The Stereo Transforms
Rotate—Stereo Panning
While rotate is not usually one of the first stereo transforms most audio
practitioners have had experience with, we’ll begin with it here. Rotate is easily
constructed from the sine-cosine panning law, and it will be seen that all the
other stereo transforms to be reviewed may be regarded as simply variations on this
transform. Rotate may be viewed as a truly stereo panning algorithm as it accepts a
stereo input and yields a stereo output. It acts to reposition the elements in a stereo
image without adjusting their relative gains, as it will be seen the other transforms to
be considered do. Where the sine-cosine panning law is used to position a mono
signal, rotate is used to position (or reposition) a stereo signal and should be
regarded as the equivalent operation to panning for stereo signals. It is surprising
then, that while stereo hardware and software mixing desks employ panning, very
few implement rotation, true stereo panning.
It is easiest to see the relationship between panning and rotation by beginning in
the MS domain; the stereo rotation transform is implemented by adding scalars for
the input S component of a stereo signal to the MS panning law of Equation 4:

M_rotate = cos(θ_r) M - sin(θ_r) S
S_rotate = sin(θ_r) M + cos(θ_r) S
(5)

θ_r is the rotation angle. As would be expected, a rotation of +45° will place what was
in the centre of the image at the left loudspeaker, while -45° rotates what was in the
centre to the right loudspeaker. In the LR domain, rotation is implemented
L_rotate = cos(θ_r) L + sin(θ_r) R
R_rotate = -sin(θ_r) L + cos(θ_r) R
(6)
Rotation can, of course, be used both creatively and correctively. Clearly, just as
one might wish to position a monophonic signal in a stereo mix, placing it at some
desired azimuth angle, one may wish to do the same with a stereo signal,
positioning it within a stereo mix. Using rotate preserves all the vivid cues and
spatial information in a stereo signal while altering azimuth. One corrective use of
rotate might be to re-image a stereo recording so that an element intended to appear
in the centre of the image, perhaps a singer, is brought into the centre.
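The rotation of Equation 6 might be sketched as follows; again a minimal Python illustration with invented names:

```python
import math

def rotate_lr(l, r, theta_r_deg):
    """Equation 6: rotate a stereo signal in the LR domain."""
    a = math.radians(theta_r_deg)
    return (math.cos(a) * l + math.sin(a) * r,
            -math.sin(a) * l + math.cos(a) * r)

# A centre image (equal L and R) rotated by +45 deg lands fully on the
# left loudspeaker, as described above for the MS form of the transform.
k = math.sqrt(2.0) / 2.0
l, r = rotate_lr(k, k, 45.0)
```

Because rotation is an orthogonal operation, relative gains within the image are preserved, which is the property that distinguishes it from the other transforms reviewed below.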
Panning beyond +/-45°—moving beyond the loudspeakers
When introducing the sine-cosine panning law, the panning angle θ_p was listed
as restricted between +/-45°. While this range is sufficient to place a sound between
the loudspeakers, there is nothing in the panning law to limit the choice of θ_p to
these values. One can just as easily choose θ_p between +/-60°, or +/-90°, or even
+/-180°. There may be reasons not to do so, particularly if there are concerns with
mono playback compatibility suitable for broadcast. However, if one is targeting
stereo and is interested in stereo imaging, it is useful to consider what happens
when moving beyond +/-45°.
Those who have experience with the Blumlein stereo microphone technique will
recognize the regions from +45° to +135° (+L to -R) and -45° to -135° (+R to -L) as the
‘phasey’ pick-up regions of the microphone array. This phasiness has been described
as producing vague and ambiguous localization, and for the recording engineer
concerned with accurate and convincing stereo illusion, is usually avoided. These
regions are called phasey as the sine-cosine law (and/or Blumlein microphone
array) begins to introduce antiphase, or opposite polarity, signals at the opposite
loudspeaker. Just past +45° (+L), the left loudspeaker, the amount of antiphase
signal on the right loudspeaker is minimal—and depending on the sound material,
may appear to edge just past the left loudspeaker. Further increasing θ_p, the sound
may appear to move further past the loudspeaker, usually becoming delocalized as
vague and hazy. When θ_p is equal to +90° (+S), the left and right loudspeakers are
fed with signals of the same gain, but opposite polarity. Kendall (1995) has described
the impression of such a signal as appearing close to the listener. For the author, the
panned sound feels slightly behind the head.
Returning to rotation and applying θ_r of 0° to the identity signal of Figure 2 will
return the signal in Figure 2. This is the case for all the transforms to be reviewed;
applying 0° results in a transparent transform. Applying a rotation of +7.5°, +15°
and +30° will return the signals illustrated in Figure 3, Figure 4 and Figure 5. As can
be seen in all of these, elements of the identity image that were previously between
+/-45°, left and right, are now rotated into phasey regions. Similarly, elements
previously in phasey regions are now between +/-45°. We will see that all the
transforms are similar in this aspect, shifting some elements from the non-phasey
regions to the phasey, and vice versa.
Width
Along with balance, the width5 transform is one of the more familiar stereo
imaging transforms. In the LR domain, the transform for width can be created by
modifying the equation for rotation (Equation 6). For the scalars on the right
channel, R, θ_r is replaced with -θ_r, resulting in
L_width = cos(θ_w) L - sin(θ_w) R
R_width = -sin(θ_w) L + cos(θ_w) R
(7)
This changes the direction of rotation applied on the right channel. Rather than
rotating both L and R through the stereo stage together, R is rotated in the opposite
direction to L. Negative values from 0° to -45° will narrow the width of the image;
this range of values rotates L and R towards the centre, with -45° resulting in
collapsing the image to mono at the centre of the stereo stage. Figure 6 illustrates
transforming the width of the identity signal by -30°. Compare this illustration to
Figure 5, a rotation by +30°. Note the mark indicating the transformed location of
the input +R for both of these appears at the same azimuth, -15°. This should not be
especially surprising considering the discussion above. The input +L has been
rotated by -30° (clockwise) while +R has been rotated by +30° (counter-clockwise), in
the opposite direction.
Reviewing against the input identity signal, other features of the width transform
may be observed. Along with the compression of elements towards the centre,
notice the gain changes applied to the elements of the input. For the elements in the
frontal +/-45°, the gain changes are minimal, the most significant being around a
2dB increase applied to the centre element of the image. More significant,
however, is the gain reduction applied to elements in the phasey regions (between
+L/-R and -L/+R) of the input. The elements located on the S axis have had a gain
reduction of nearly 9dB. Also, azimuth displacements have been applied to all
elements of the input signal except those on the M and the S axes.

5 The author has additionally seen the width transform referred to as the stereo
base, stereo basis and stereo differential control.
Positive values will in principle increase the width of an image, though the result
may not be as simple as first appears. Figure 7 illustrates a transform of +15°,
resulting in the widening of the front +/-45° stereo stage. It can be seen that the
elements on the L and R axes have been displaced into the phasey region. As
mentioned earlier, depending on the sound material, these elements may now
appear just beyond the left and right loudspeakers. Additionally, what was at
+/-30° now appears on the L and R axes.
Transforming the width by +30° results in the signal illustrated in Figure 8,
producing a significant distortion of the image. Perhaps the most notable is the gain
applied to elements across the front +/-45° stage; elements on the M axis are now
down by nearly 9dB. As for azimuth displacements, elements are now compressed
towards the S axis rather than the M axis, as was the case for width by -30°. As we
did for the width transform by -30°, compare the illustration for width by +30° with
that for rotation by +30° (Figure 8 and Figure 5). Notice that elements which were at
+45° and +15° in the identity signal now appear at +75° and +45° for both transform
results. While these two elements are at the same azimuths, the gains for the element
now at +45° is different by 6dB.
Width presented in the MS domain is probably the form in which most readers
have encountered the transform:

M_width = √2 sin(45° - θ_w) M
S_width = √2 cos(45° - θ_w) S
(8)
Here it becomes apparent that width changes the balance of M to S. As we have
seen, this is equivalent to rotations of L and R in opposite directions. However, the
action of width seen above, of changing gains on the M and S axes, becomes very
explicit. As Equation 8 presents the transform in terms of the sine-cosine panning
law, the gain on M varies as width is changed. For narrowing widths, θ_w equal to
negative values towards -45°, gain on M increases towards +3dB. For increasing
widths, θ_w towards +45°, gain on M decreases towards -∞ dB. Gain on M in Equation 8
may be normalized to 0dB for all values of θ_w, and this is the form in which the
width transform often appears. However, presenting it as seen in Equation 8 makes
the rotations and relations to the transforms currently reviewed more apparent.
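A sketch of Equation 8, showing the +3dB and -∞ dB limiting cases described above (illustrative Python, as before):

```python
import math

def width_ms(m, s, theta_w_deg):
    """Equation 8: the width transform in the MS domain."""
    a = math.radians(45.0 - theta_w_deg)
    r2 = math.sqrt(2.0)
    return (r2 * math.sin(a) * m, r2 * math.cos(a) * s)

# Narrowing fully (-45 deg) zeros S and scales M by sqrt(2), i.e. +3 dB;
# widening fully (+45 deg) zeros M instead, leaving only the side signal.
m_n, s_n = width_ms(1.0, 1.0, -45.0)
m_w, s_w = width_ms(1.0, 1.0, 45.0)
```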
Balance
Balance is perhaps the most familiar of the stereo imaging transforms because
it is regularly implemented in consumer audio devices. It can be seen as
merely swapping the domain in which the width transform is applied; later we will see this
domain swap is a useful way to create other transforms, as well. In the LR domain,
width operates by rotating the L and R axes towards or away from the M axis. The
same tactic can be applied to rotation in the MS domain. For the scalars on the side
channel, S, θ_r in Equation 5 is replaced with -θ_r, resulting in

M_balance = cos(θ_b) M + sin(θ_b) S
S_balance = sin(θ_b) M + cos(θ_b) S
(9)
The balance transform, then, is a rotation of what was encoded on the M and S axes
towards or away from the L axis. In the LR domain balance appears as
L_balance = √2 cos(45° - θ_b) L
R_balance = √2 sin(45° - θ_b) R
(10)
It is this form, or a gain normalized form, in which balance is often presented. Here, in
Equation 10, the scalars appearing on L and R appear very similar to those
appearing on M and S in Equation 8, and the close relationship between width and
balance can be seen.
Figure 9 and Figure 10 illustrate the result of applying balance by +15° and +30°
to the identity signal. As with width, not only are the azimuths of the elements
changed, but gain is altered as well. In the case of balance by +30°, the elements
between +M and –S are reduced in gain, with the element on the R axis down by
nearly 9dB. Gain on the L axis is up by about 2dB.
Comparing the illustrations for balance by +30° and rotation by +30° (Figure 5) it
is possible to add more insight to the azimuth distortions produced by the balance
transform. Notice that what was at 0° in the identity signal appears at +30° in both
the balance and the rotation result. Similarly, the element previously at -30° in the
input now appears at 0° in both the balance and rotate illustrations. However, where
rotate leaves the gain of this element unchanged, balance reduces the gain by 6dB.
Also, be clear that while rotate moves all elements across the stereo stage, balance
keeps the elements on the L and R axes anchored to their original locations. Moving
in the other direction, Figure 11 illustrates balance by -30°, compressing the image
towards the right axis.
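Equation 10 can be sketched and its limiting behaviour checked as follows (illustrative Python naming, as with the earlier fragments):

```python
import math

def balance_lr(l, r, theta_b_deg):
    """Equation 10: the balance transform in the LR domain."""
    a = math.radians(45.0 - theta_b_deg)
    r2 = math.sqrt(2.0)
    return (r2 * math.cos(a) * l, r2 * math.sin(a) * r)

# Balancing fully to the left (+45 deg) scales L by sqrt(2), i.e. +3 dB,
# and mutes R; the L and R axes themselves stay anchored in azimuth.
l, r = balance_lr(1.0, 1.0, 45.0)
```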
Middle Panorama
Along with asymmetry, middle panorama6 or m-pan is one of the lesser known
imaging transforms, though it has appeared in a number of software and hardware
mixing and mastering products. The transforms encountered so far, rotation, width
and balance, apply a rotation to both axes in a domain, LR or MS. In contrast, m-pan,
along with the remainder of the transforms to be seen, apply rotation to only one
axis. In the case of m-pan, it is to the middle axis that rotation is applied. Returning
to rotation in the MS domain (Equation 5), for the scalars on the side channel, S, θ_r is
replaced with 0°, resulting in

M_m-pan = cos(θ_m) M
S_m-pan = sin(θ_m) M + S
(11)
M-pan, then, is a rotation of M, but with S kept in place. Rendering the transform in
the LR domain is much less elegant, so for aesthetic reasons the LR form will be
omitted here.
Figure 12 and Figure 13 illustrate the result of transforming the identity signal
with m-pan by +15° and +30°. Let us consider the results of m-pan in the light of
both rotation and balance by reviewing the illustration for m-pan by +30° with those
for both rotate and balance by +30° (Figure 5 and Figure 10). Notice that the element
originally encoded at 0° is now displaced to +30° by all these transforms, however
there are significant differences in gain distortions and azimuth displacements. For
the same displacement of what was at 0°, m-pan makes much smaller gain
changes than does balance: approximately 3dB down at L rather than 9dB. Compared to
rotate, azimuth changes to the overall signal are reduced, where change is primarily
6 The author has additionally seen middle panorama referred to as centre pan,
direction mix, and direction control.
focused on modifying what was at the centre of the image. It is for these reasons that
in a number of cases the m-pan algorithm may be preferred over both balance and
rotate to adjust a stereo field.
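A sketch of the m-pan transform of Equation 11 (illustrative naming):

```python
import math

def m_pan(m, s, theta_m_deg):
    """Equation 11: rotate the middle axis while S stays in place."""
    a = math.radians(theta_m_deg)
    return (math.cos(a) * m, math.sin(a) * m + s)

# A pure-centre element (S = 0) panned by +30 deg picks up a new S
# component while its M component is scaled by cos(30 deg).
m_out, s_out = m_pan(1.0, 0.0, 30.0)
```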
Asymmetry
Asymmetry7 is a stereo transform that has often been regarded as obscure and
mysterious, and is relatively rarely mentioned. The author first encountered
asymmetry in a commercially available software stereo imaging tool, and it is the
investigations to understand this transform that have led to the current survey of
imaging techniques derived from the sine-cosine panning law. Asymmetry is closely
related to m-pan, but where m-pan rotates the middle axis, the asymmetry
transform rotates the side axis, keeping the middle in place. Reviewing rotation in
the MS domain (Equation 5), for the scalars on the middle channel, M, θ_r is replaced
with 0°, resulting in

M_asymmetry = M - sin(θ_a) S
S_asymmetry = cos(θ_a) S
(12)
The reader should notice a strong similarity between Equations 12 and 11. As with
m-pan, rendering asymmetry in the LR domain is significantly less elegant, so is
omitted here.
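A sketch of Equation 12, confirming that a pure-centre element is left untouched (illustrative naming):

```python
import math

def asymmetry(m, s, theta_a_deg):
    """Equation 12: rotate the side axis while M stays in place."""
    a = math.radians(theta_a_deg)
    return (m - math.sin(a) * s, math.cos(a) * s)

# An element on the M axis (S = 0) is unchanged for any angle, which is
# why asymmetry keeps the centre of the image anchored.
m_out, s_out = asymmetry(1.0, 0.0, 60.0)
```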
Figure 14 and Figure 15 illustrate the result of applying the asymmetry transform
with arguments of +30° and +60° to the identity signal. Comparing the figures for
asymmetry and m-pan by +30° (Figure 13), notice that asymmetry keeps what was at
+/-M anchored to the M axis, as expected. Also note the new locations of the
elements that were on the L and R axes. For the same argument, asymmetry and
m-pan displace these to the same azimuth, but with different gains. Asymmetry and
m-pan by +30° both move what was on the R axis to -30°, a displacement of +15°.
However, where asymmetry has increased the gain of this element by about 2dB,
m-pan has resulted in a reduction of about 3dB. Having seen this 15° translation of
elements previously on the L and R axes, it may be beneficial for the reader to
review these two illustrations with respect to Figure 4, a rotation of +15°.

7 The author has additionally seen asymmetry referred to as the stereo stability
control.
Aside from the more obvious re-imaging applications involving the
rearrangement of an existing stereo image while retaining the central element in the
centre of the image, Gerzon (1990) has suggested a number of very creative remedial
uses of asymmetry. Gerzon’s discussion is of interest to the mastering engineer who
is required to produce a usable stereo result from an initial recording which has
intermittent technical issues with one (or both) of the channels. Of particular interest
is the introduction of frequency and/or amplitude dependencies to the algorithm.
As mentioned in the introduction, a brief account of frequency dependency will be
explored in the final section of this current review. The interested reader is referred
on to Gerzon’s enlightened considerations.
Left and Right Panorama
A discussion of stereo imaging transforms would not be complete without a look
at what might rightly be described as the two transforms users of stereo mixing
desks are most familiar with. As they are readily implemented by users of these
devices, these two procedures may not even be regarded as worthy of the
nomenclature ‘transforms’. However, it is possible to choose a wider range of values
to apply than mixing desks usually allow, giving greater flexibility. Confined to the
possibilities of hardware desks, left panorama and right panorama, or l-pan and r-pan,
may be implemented by simply adjusting the pan of an input left or right channel. In
practice, l-pan and r-pan are often used together to adjust the positioning of
elements in, and width of, a stereo field.8
R-pan
The transform for r-pan may be developed from the equation for rotation in the
LR domain (Equation 6). For the scalars on the left channel, L, θ_r is replaced with 0°,
resulting in

L_r-pan = L + sin(θ_rp) R
R_r-pan = cos(θ_rp) R
(13)
Figure 16 and Figure 17 illustrate an r-pan transformation of the identity signal by
+15° and +30°. Adding the illustration for r-pan by +15° to the comparison made
above, m-pan and asymmetry by +30° with rotation by +15° (Figure 13, Figure 14
and Figure 4), further insights may be gained. All these place what was on the R axis
at -30°. Not unexpectedly, where m-pan, asymmetry and rotation shift what was on
the L axis into the phasey region, r-pan locks the left axis in place. Consequently the
resulting image, gains and placement of elements differ, too.
Turning attention to the element that was on the M axis, review the illustrations
for r-pan by +30° and m-pan, balance and rotate by +15° (Figure 17, Figure 12,
Figure 9 and Figure 4). Observe that while the position of the element formerly on
the M axis is now at +15° in all these, the resulting distribution and gain changes to
the other elements are varied. R-pan and balance keep what was between +/-45° in
the non-phasey region, while m-pan and rotate, to varying degrees, do not. From a
production point of view, these transforms give different options and creative
results for the task of moving what was at centre to another azimuth.

8 Recall from the discussion on width that the width transform can be seen as
linking l-pan and r-pan together, assigning a negative value to the rotation angle
given to R.
L-pan
The l-pan transform completes the set. As would be expected, the equation for
l-pan may be derived from rotate in the LR domain (Equation 6). θ_r is replaced with
0° for the scalars on the right channel, R, giving

L_l-pan = cos(θ_lp) L
R_l-pan = -sin(θ_lp) L + R
(14)
Seeing both Equations 13 and 14 it should not be surprising these resemble
Equations 11 and 12, those for m-pan and asymmetry. Both sets act in the same
manner, rotating one of the axes of the input signal in either the LR or MS domain.
Within Equation 14 there is nothing to restrict values assigned to θ_lp to keep
what was previously between +/-45° in the non-phasey region. Figure 18 and Figure
19 illustrate the result of transforming the identity signal with l-pan by +15° and
+30°. In general, hardware mixing desks are incapable of the transform illustrated
here because their implementations of the panning law restrict l-pan and r-pan to
the non-phasey region. For a creative audio practitioner this restriction may be
limiting and undesirable. Reviewing the illustrations for l-pan and rotation by +15°,
and m-pan and asymmetry by +30° (Figure 18, Figure 4, Figure 13 and Figure 14)
one sees all these transforms place what was on the L axis at +60°. With the
exception of rotation, in this case l-pan changes the gain of the elements the least.
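Equations 13 and 14 can be sketched together (illustrative Python, as with the earlier fragments):

```python
import math

def r_pan(l, r, theta_deg):
    """Equation 13: re-pan the right channel; the left axis stays locked."""
    a = math.radians(theta_deg)
    return (l + math.sin(a) * r, math.cos(a) * r)

def l_pan(l, r, theta_deg):
    """Equation 14: re-pan the left channel; the right axis stays locked."""
    a = math.radians(theta_deg)
    return (math.cos(a) * l, -math.sin(a) * l + r)

# r-pan by +30 deg moves a pure right-axis signal (L = 0) partly into the
# left channel; a pure left-axis signal would pass through untouched.
l_out, r_out = r_pan(0.0, 1.0, 30.0)
```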
Direct-ability
While the author has seen hardware and software imaging tools implemented as
a cascade of a number of the transforms reviewed, it should be apparent that such
arrangements are in fact redundant, possibly counter-intuitive, and potentially
detrimental. A useful exception to this caution is that it is possible to modify the
basic transforms so they become ‘direct-able’. The relationship between width and
balance is one of ‘direct-ability’: width is across one domain while balance is across
the other. Each of the transforms can be modified so they are direct-able by first
rotating the azimuth of interest into the centre of the image, secondly, performing
the desired transform, and finally rotating the azimuth of interest back to its original
position. The (LR domain) network shown in Figure 20 illustrates the arrangement
described, with θ_r as the azimuth of interest and θ_i as the imaging transform
argument. Details of implementations (and simplifications) will be left to the reader,
but it is the author’s opinion that deploying the imagers in this way presents the
user with the most ergonomic methods to adjust and shape the spatial imagery of
a stereo signal. In fact, all the transforms presented to this point are possible to realise
through the use of a single (e.g., width) direct-able imaging transform followed by a
rotation. The reason, of course, is that all the transforms are merely rotation of one
or more axes in one of the domains.
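The arrangement of Figure 20 can be sketched as follows. This is a minimal illustration, assuming an LR-domain sine-cosine rotate and an imaging transform passed in as a callable; all names are illustrative rather than taken from the original:

```python
import numpy as np

def rotate(left, right, angle_deg):
    """Sine-cosine rotation of the whole stereo image (LR domain)."""
    theta = np.radians(angle_deg)
    return (np.cos(theta) * left + np.sin(theta) * right,
            -np.sin(theta) * left + np.cos(theta) * right)

def directable(transform, left, right, azimuth_deg, transform_deg):
    """Figure 20's network: rotate the azimuth of interest to the
    centre of the image, apply the imaging transform there, then
    rotate back to the original position."""
    l, r = rotate(left, right, azimuth_deg)
    l, r = transform(l, r, transform_deg)
    return rotate(l, r, -azimuth_deg)
```

With `transform` set to a width imager, this yields a width control centred on any azimuth. Since rotation by θ followed by rotation by −θ is the identity, a null transform leaves the signal untouched, which makes a convenient sanity check.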
Frequency Dependent Transforms
The transforms reviewed above in the forms shown are quite powerful, allowing
one to reshape and reposition stereo images to both creative and remedial ends.
Creative choices about positioning and impression can be made, or corrective
actions to resolve difficulties with the stereo image can be taken. In introducing the
addition of frequency selectivity to these tools, one shouldn’t forget the leverage the
non-frequency dependent implementations provide. Not surprisingly, adding
frequency discrimination to the transforms reviewed further extends the range of
possible interventions on the image, some of these with more predictable outcomes
than others.
In principle, frequency selectivity can be added through the use of some sort of
crossover network. The exact network chosen may vary. In the simplest case,
one can imagine separating an input signal into two frequency bands, then applying
one transform to one band while leaving the other untouched. Figure 21 illustrates the
network described. For a simple case like this, the crossover network should (likely)
be phase equalised, where the phase at the crossover frequency is matched for both
bands. The well-known Linkwitz-Riley network has this characteristic (Bohn 2005)
and is therefore a reasonable choice here. Depending on the application, more
complicated phase equalised networks may also be used. In some cases these
crossovers can be implemented as unitary feedback networks (Gerzon 1976b),
though FIR networks can also be used. This isn't to say, however, that non-phase-matched
networks haven't, in practice, been used for some applications. In general,
phase-unmatched crossover networks are advised against, as they can introduce an
uncomfortable and usually undesired phasiness to the resulting stereo signal.
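A minimal sketch of the network described might look like the following, assuming SciPy is available. It builds a 4th-order Linkwitz-Riley crossover from two cascaded 2nd-order Butterworth filters per band, a standard construction whose bands are phase matched and sum to an all-pass response; the function names are assumptions:

```python
import numpy as np
from scipy.signal import butter, lfilter

def lr4_split(x, fc, fs):
    """4th-order Linkwitz-Riley crossover: two cascaded 2nd-order
    Butterworth filters per band, so low + high sum to an all-pass
    (phase-matched) response."""
    b_lo, a_lo = butter(2, fc / (fs / 2), btype='low')
    b_hi, a_hi = butter(2, fc / (fs / 2), btype='high')
    low = lfilter(b_lo, a_lo, lfilter(b_lo, a_lo, x))
    high = lfilter(b_hi, a_hi, lfilter(b_hi, a_hi, x))
    return low, high

def band_limited(transform, left, right, fc, fs, angle_deg):
    """Apply an imaging transform to the band below fc only,
    leaving the band above fc untouched."""
    lo_l, hi_l = lr4_split(left, fc, fs)
    lo_r, hi_r = lr4_split(right, fc, fs)
    lo_l, lo_r = transform(lo_l, lo_r, angle_deg)
    return lo_l + hi_l, lo_r + hi_r
```

Applying the transform to the high band instead, or to both bands with different arguments, is a trivial variation on the same structure.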
Spatial Equalisation
Since Lord Rayleigh’s (1907) initial investigations into hearing, there has been an
awareness that different mechanisms come into play for hearing at low and high
frequencies. Without getting into the details of these two (or more) mechanisms, our
concern for stereo playback is the result that low frequencies appear to subtend
across a narrower angle than do high frequencies; in other words, the low
frequency stage appears narrower than the high. Blumlein's patent from
1931 laid the groundwork for a class of processing usually referred to as ‘stereo
shuffling’ (Clark, Dutton, and Vanderlyn 1958; Harwood 1968; Gerzon 1986). Of
these techniques, the process Griesinger (1986) has called spatial equalisation is of
particular interest, having been first deployed in the mid 1950s as part of the
Stereosonic system to balance the width of the low and high frequency stereo stages.
From our brief discussion of frequency selectivity, implementations may come to
mind. Low and high frequencies are to be separated, processed for width
accordingly, and then summed together, resulting in an image that has the same
apparent width across the high and low stereo stages. The author finds the use of a
2nd-order Linkwitz-Riley network to be the most convenient crossover, with the low
stage, rather than the high, being adjusted for width. Griesinger (1986), Gerzon
(1986, 1994) and Lipshitz (1986; Lipshitz, Griesinger, Gerzon 1987) have discussed
spatial equalisation and its applications in detail. Gerzon's review is especially
thorough, and interested readers are encouraged to consult it.
For the purist sound recordist, the author regards spatial equalisation as an
indispensable part of the recordist's toolkit. While admired for clarity and definition
of image, single point coincident microphone techniques are often criticised for ‘lack
of spaciousness’ when compared to spaced techniques.9 By broadening the width of
the low frequency stage, the image of a coincident recording can be appropriately
expanded, while retaining the clarity of image definition admired for the technique
and often missing in non-coincident recordings. Similarly, stereo images constructed
in the studio with pan potted mono signals may be improved through the
application of spatial equalisation, adding broadness and expanding a sense of
spatial immersion and stability by matching high and low stages. It is for these and
the above reasons the author is surprised spatial equalisation is not regularly
included in the default toolset of stereo digital audio workstations, particularly as
it has appeared in the literature for some time.

9 Lipshitz has noted that 'spaciousness' in non-coincident techniques is usually the
result of conflicting, missing, or confused localisation cues.
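A sketch of spatial equalisation along these lines follows. For simplicity it uses a 4th-order Linkwitz-Riley crossover built from cascaded Butterworth filters rather than the 2nd-order network the author prefers (a 2nd-order Linkwitz-Riley additionally requires a polarity inversion on one band). The sine-cosine form of the width transform shown is one plausible convention, and the crossover frequency and widening angle are placeholders, not values from the text:

```python
import numpy as np
from scipy.signal import butter, lfilter

def width(left, right, angle_deg):
    """One sine-cosine form of the width transform: positive angles
    rotate the L and R axes toward each other (narrower, mono at
    +45 degrees), negative angles rotate them apart (wider)."""
    theta = np.radians(angle_deg)
    return (np.cos(theta) * left + np.sin(theta) * right,
            np.sin(theta) * left + np.cos(theta) * right)

def spatial_eq(left, right, fc=300.0, fs=48000.0, low_width_deg=-10.0):
    """Spatial equalisation sketch: broaden only the low-frequency
    stage so it matches the apparent width of the high stage."""
    def split(x):
        b_lo, a_lo = butter(2, fc / (fs / 2), btype='low')
        b_hi, a_hi = butter(2, fc / (fs / 2), btype='high')
        return (lfilter(b_lo, a_lo, lfilter(b_lo, a_lo, x)),
                lfilter(b_hi, a_hi, lfilter(b_hi, a_hi, x)))
    lo_l, hi_l = split(left)
    lo_r, hi_r = split(right)
    lo_l, lo_r = width(lo_l, lo_r, low_width_deg)
    return lo_l + hi_l, lo_r + hi_r
```

Adjusting the low stage, rather than the high, follows the preference stated above; swapping which band is processed is a one-line change.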
Stereo Spreading
Regardless of the variety of architectures that have appeared in the literature, at
the root of it, stereo spreading can be regarded as the application of frequency
selective rotation to a stereo image. The goal is to broaden an image by rotating or
‘spreading’ the frequencies of an input across the stereo stage. In the simplest (and
least effective) case, an input can be divided into two bands, where the low and high
are then rotated at opposite angles. The result will be an output where the image
rotates across the stereo stage as frequency increases. For this example, our
perception will likely be that of a split and un-fused image, rather than one that has
broadened. However, with enough frequency dependent rotation stages (on the
order of critical bands), the image will appear as fused, and elements in the image
will take on additional width. Particularly for pan potted mono sounds, these can be
broadened from simple point images to those fleshed out with a sense of body.
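The simplest two-band case described above can be sketched as follows; as noted, a usable spreader would repeat the idea over many more bands, on the order of critical bands. SciPy, the crossover choice, and all names are assumptions:

```python
import numpy as np
from scipy.signal import butter, lfilter

def rotate(left, right, angle_deg):
    """Sine-cosine rotation of the stereo image (LR domain)."""
    theta = np.radians(angle_deg)
    return (np.cos(theta) * left + np.sin(theta) * right,
            -np.sin(theta) * left + np.cos(theta) * right)

def two_band_spread(left, right, fc, fs, angle_deg=15.0):
    """The simplest (and least effective) spreader: split at fc with
    a 4th-order Linkwitz-Riley crossover and rotate the two bands to
    opposite azimuths."""
    def split(x):
        b_lo, a_lo = butter(2, fc / (fs / 2), btype='low')
        b_hi, a_hi = butter(2, fc / (fs / 2), btype='high')
        return (lfilter(b_lo, a_lo, lfilter(b_lo, a_lo, x)),
                lfilter(b_hi, a_hi, lfilter(b_hi, a_hi, x)))
    lo_l, hi_l = split(left)
    lo_r, hi_r = split(right)
    lo_l, lo_r = rotate(lo_l, lo_r, -angle_deg)  # low band to one side
    hi_l, hi_r = rotate(hi_l, hi_r, +angle_deg)  # high band to the other
    return lo_l + hi_l, lo_r + hi_r
```

Because each band is only rotated, per-sample energy within a band is preserved; it is the summation of differently rotated bands that produces the spread.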
Orban’s (1970) initial implementation uses a method resulting in filters where
phase is unmatched between L and R channels.10 It is possible to argue such an
arrangement may be desirable, as the goal is to broaden the width of an input image
and phase mismatch can add to a sense of broadness. Arguments have been made
against this approach, due to the resulting increase in the phasiness of the
image. Nevertheless, the author has found Orban's method to be practical, owing to its
ease of implementation, and, with care, suitable for a variety of sound material.
10 Orban’s initial implementation is limited to mono inputs. However, with the
addition of rotation, Gerzon (1992) has adapted Orban’s method for stereo inputs.
Gerzon (1992; 1997) has demonstrated a number of alternative network
architectures, each designed to address the phasiness of Orban's approach. Perhaps
the cleverest of these involves the use of a unitary feedback network to create a
phase matched frequency dependent rotation network. The interested reader is
directed to Gerzon’s discourse on the matter.
Other transforms
Certainly, the imagination is the limit for creative frequency dependent
applications. The same could be said for more remedial or corrective tasks. This class
of efforts is often found at the mastering stage, or after a location recording, where
a stereo mix exists but the original elements are no longer available to
build a new stereo mix. As an example, imagine a stereo recording with a singer in
the centre of the stereo stage and a high hat cymbal that has been placed at an
azimuth in the stereo panorama that is now regarded as inappropriate or incorrect.
A first thought might suggest the use of frequency dependent rotation, splitting the
image into high and low, or even better, a crossover network selecting out only the
frequency range occupied by the cymbal. The selected range could then be rotated to
the desired azimuth, correcting the position of the cymbal in the resulting stereo
image. Further reflection on this approach will suggest the above algorithm may not
be ideal. Very likely, the selected frequency band will also include sibilance from the
singing voice. When the cymbal isn't playing, and thus no longer masking the voice,
the fricatives will likely appear separated and, in our case, moved off centre from
the rest of the voice. So, while this intervention would correct the cymbal, it would introduce
another problem for the voice.
With more thought on the desired outcome, a more suitable approach may be
taken. In detail, the problem is to reposition the cymbal, as selected by frequency,
while retaining the position of the sibilance of the singer in the centre. Reviewing the
transforms discussed, the task described matches that performed by asymmetry.
Rather than rotate the selected band, asymmetry can be used, moving the cymbal
while keeping the vocal sibilance centred and in place. Implementation of the
frequency selection network will be left to the reader, but a few options are
described. The simplest of these is to use an amplitude-complementary, but
non-phase-matched, band-pass/band-stop filter pair. Depending on the amount of
asymmetry applied, the additional phasiness may be minimal and quite acceptable.
As a second strategy, one might choose band-pass/band-stop filter
pairs arranged in a Linkwitz-Riley manner. Such a network will not
induce phasiness, so may be preferable in some circumstances, and may be
considered a more general solution. More speculatively, and though the author has
not attempted to do so, it may be possible to implement the desired algorithm as a
unitary feedback network. While not the simplest of choices, as unitary feedback
networks are not particularly intuitive, this approach could result in an elegant
solution, leading to a non-phasey outcome with smooth and balanced changes to
imaging across frequency.
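The approach described can be sketched as below, using the simplest frequency-selection option mentioned: an amplitude-complementary, non-phase-matched band-pass/band-stop pair, where the stop band is simply the input minus the band-pass output. The form of the asymmetry transform shown, rotating only the S axis while keeping M in place, is one plausible convention (sign conventions vary between authors), and all names are assumptions:

```python
import numpy as np
from scipy.signal import butter, lfilter

def asymmetry(left, right, angle_deg):
    """One form of the asymmetry transform: rotate only the S axis,
    keeping the M axis in place.  Treat the sign of angle_deg as
    illustrative, as conventions differ."""
    theta = np.radians(angle_deg)
    m = (left + right) / np.sqrt(2.0)
    s = (left - right) / np.sqrt(2.0)
    m2 = m - np.sin(theta) * s
    s2 = np.cos(theta) * s
    return (m2 + s2) / np.sqrt(2.0), (m2 - s2) / np.sqrt(2.0)

def reposition_band(left, right, f_lo, f_hi, fs, angle_deg):
    """Move only the f_lo..f_hi band (e.g. a cymbal) off centre with
    asymmetry, leaving centred material such as vocal sibilance in
    place.  The band split is amplitude complementary (band =
    filtered, stop = input - band) but deliberately NOT phase
    matched: the simplest option described in the text."""
    b, a = butter(2, [f_lo / (fs / 2), f_hi / (fs / 2)], btype='band')
    def split(x):
        band = lfilter(b, a, x)
        return band, x - band
    band_l, rest_l = split(left)
    band_r, rest_r = split(right)
    band_l, band_r = asymmetry(band_l, band_r, angle_deg)
    return band_l + rest_l, band_r + rest_r
```

Centred material (L equal to R) has no S component, so asymmetry leaves it exactly in place regardless of the angle; that is precisely the property that keeps the vocal sibilance centred while the cymbal moves.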
As one would expect, the variety of imaging algorithms discussed above may be
adapted and modified via a variety of frequency dependencies, resulting in a wide
array of creative and remedial imaging opportunities for the audio practitioner.
While just a few have been discussed in this section, many more are possible, and
indeed, desirable. Careful examination of the imaging adjustments available with
each of the sine-cosine transforms can, with the addition of frequency selection,
lead the reader to new tools and new approaches.
Conclusions
Aside from rotation itself, we’ve seen the classic stereo imaging transforms
explored here are merely modifications of the rotate transform (sine-cosine panning
for stereo) in one domain (LR or MS) or the other. Width and balance are equivalent
across domains: width rotates the LR axes together, while balance does the same for
the MS axes. M-pan and asymmetry have a correspondence to l-pan and r-pan: m-
pan and asymmetry rotate only the M axis and the S axis, respectively, while l-pan
and r-pan act on the L and R axes as named. Furthermore, the single axis transforms
(m-pan, asymmetry, l-pan and r-pan) are, as would be expected, closely related to
the dual axes transforms (width, balance). M-pan and asymmetry can be thought of
as two versions of balance, each bringing the MS axes together at its extreme. The
difference, in the final imaging, has to do with the final azimuth positions of the axes
in the resulting image. For example, balance with θ_b equal to +45° will result in an
image similar, but for resulting azimuth, to asymmetry with θ_a equal to -90°. Both
will return the input L with a gain of +3dB; however, balance keeps L in place, while
asymmetry has kept M in place, resulting in L in the middle of the image. M-pan
with θ_m equal to +90° results in L at the S axis. The same relationships hold true for
width and l-pan and r-pan; and, we’ve mentioned that in practice, width (to narrow
only) is often implemented by users of hardware or software stereo mixers through
panning the left and right channels independently.
As equations for each of the transforms have been presented, the reader is
offered the opportunity to implement, audition and incorporate these powerful tools
into his or her own creative audio practice, gaining control of some of the important
spatial attributes encoded in a stereo signal. And while it hasn’t been investigated in
detail here, the very useful notion of direct-ability of the basic imagers has been
introduced, ideally inspiring further experimentation and exploration. Similarly, a
few instances of frequency dependent applications have been touched upon,
suggesting possibilities for many more imaginative and remedial interventions on a
stereo image. Likewise, too numerous to mention in detail, further modifications
(e.g., amplitude dependency, regular or irregular modulation, amplitude and/or
frequency dependent modulation, etc.) can be added to create a wide variety of
interesting and ear-catching spatial effects. All this from the humble sine-cosine
panning law!
References
Blumlein, A. D. 1931. “Improvements in and relating to Sound-transmission, Sound-
recording and Sound-reproducing Systems.” British Patent 394,325. Reprinted in
J. Eargle (ed.) Stereophonic Techniques, 32-40. New York: Audio Engineering
Society, 1986.
Bohn, D. 2005. “Linkwitz-Riley Crossovers: A Primer.” RaneNote 160. Rane
Corporation.
http://www.rane.com/pdf/ranenotes/Linkwitz%20Riley%20Crossovers%20Primer.pdf
(accessed November 6, 2007).
Clark, H. A. M., G. F. Dutton, and P. B. Vanderlyn. 1958. “The ‘Stereosonic’
Recording and Reproducing System.” Journal of the Audio Engineering Society 6(2):
102-117.
Dhomont, F. 1996. Notes for Les dérives du signe, “Acousmatic, What is it?” 24-26.
Montréal: empreintes DIGITALes, IMED 9608. CD.
Dooley, W. L., and R. D. Streicher. 1982. “M-S Stereo: A Powerful Technique for
Working in Stereo.” Journal of the Audio Engineering Society 30(10): 707-718.
Gerzon, M. A. 1976a. “Blumlein Stereo Microphone Technique.” Journal of the Audio
Engineering Society 24(1): 36, 38.
Gerzon, M. A. 1976b. “Unitary (energy-preserving) multichannel networks with
feedback.” Electronics Letters 12(11): 278-279.
Gerzon, M. A. 1986. “Stereo Shuffling. New Approach—Old Technique.” Studio Sound
28(7): 122-130.
Gerzon, M. A. 1990. “Fixing It Outside The Mix.” Studio Sound 32(9): 78, 81, 82, 85,
86, 88, 90 & 93.
Gerzon, M. A. 1992. “Signal Processing for Simulating Realistic Stereo Images.”
Preprint No. 3423, 93rd Convention of the Audio Engineering Society, October 1-4, San
Francisco, USA.
Gerzon, M. A. 1994. “Applications of Blumlein Shuffling to Stereo Microphone
Techniques.” Journal of the Audio Engineering Society 42(6): 435-453.
Gerzon, M. A. 1997. “Stereophonic Signal Processor.” United States Patent 5,671,287.
Griesinger, D. 1986. “Spaciousness and Localization in Listening Rooms and Their
Effects on the Recording Technique.” Journal of the Audio Engineering Society
34(4): 255-268.
Griesinger, D. 2002. “Stereo and Surround Panning in Practice,” Preprint No. 5564,
112th Convention of the Audio Engineering Society, May 10-13, Munich, Germany.
Harwood, H. D. 1968. “Stereophonic Image Sharpness.” Wireless World 74(July): 207-
211.
Julstrom, S. 1991. “An Intuitive View of Coincident Stereo Microphones.” Journal of
the Audio Engineering Society 39(9): 632-649.
Kendall, G. S. 1995. “The Decorrelation of Audio Signals and Its Impact on Spatial
Imagery.” Computer Music Journal 19(4): 71-87.
Lipshitz, S. P. 1986. “Stereo Microphone Techniques: Are the Purists Wrong?”
Journal of the Audio Engineering Society 34(9): 719-744.
Lipshitz, S. P., D. Griesinger, and M. A. Gerzon. 1987. “Comments on ‘Spaciousness
and Localization in Listening Rooms and Their Effects on the Recording
Technique’ and ‘Stereo Shuffling. New Approach—Old Technique’ and Authors’
Replies” Journal of the Audio Engineering Society 35(12): 1013-1014.
Malham, D. G. 1998. “Approaches to spatialisation.” Organised Sound 3(2): 167-177.
Orban, R. 1970. “A Rational Technique for Synthesizing Pseudo-Stereo from
Monophonic Sources.” Journal of the Audio Engineering Society 18(2): 157-164.
Lord Rayleigh (J. W. Strutt, 3rd Baron Rayleigh). 1907. “On our perception of sound
direction.” Philosophical Magazine 13: 214–232.
Rumsey, F. 2002. “Spatial Quality Evaluation for Reproduced Sound: Terminology,
Meaning, and a Scene-Based Paradigm.” Journal of the Audio Engineering Society
50(9): 651-666.
Windsor, L. 2000. “Through and around the acousmatic: the interpretation of
electroacoustic sounds.” In Music, Electronic Media and Culture, ed. S. Emmerson,
7-35. Aldershot, England: Ashgate Publishing.
Figures
Figure 1. LR and MS domain, with signals panned to +30°, 0° and -45°.
Figure 2. Stereo image with signals panned 0° to -15°, at +15° increments.
Figure 3. Rotate +7.5°.
Figure 4. Rotate +15°.
Figure 5. Rotate +30°.
Figure 6. Width -30°.
Figure 7. Width +15°.
Figure 8. Width +30°.
Figure 9. Balance +15°.
Figure 10. Balance +30°.
Figure 11. Balance -30°.
Figure 12. M-pan +15°.
Figure 13. M-pan +30°.
Figure 14. Asymmetry +30°.
Figure 15. Asymmetry +60°.
Figure 16. R-pan +15°.
Figure 17. R-pan +30°.
Figure 18. L-pan -15°.
Figure 19. L-pan -30°.
Figure 20. Adding direct-ability to an imaging transform.
Figure 21. Adding frequency dependency to an imaging transform.