
JSAmbisonics: A Web Audio library for interactive spatial sound processing on the web

ARCHONTIS POLITIS

Dept. of Signal Processing and Acoustics, Aalto University, Espoo, Finland

e-mail: archontis.politis@aalto.fi

DAVID POIRIER-QUINOT

IRCAM, Paris, France

e-mail: david.poirier-quinot@ircam.fr

Interactive Audio Systems Symposium, September 23rd 2016, University of York, United Kingdom.

Abstract

This paper introduces the JSAmbisonics library, a set of JavaScript modules based on the Web Audio API for spatial sound processing. Deployed via Node.js, the library consists of a compact set of tools for the reproduction and manipulation of first- or higher-order recorded or simulated Ambisonic sound fields. After a brief introduction to the fundamentals of Ambisonic processing, the main components (encoding, rotation, beamforming, and binaural decoding) of the JSAmbisonics library are detailed. Each component, or "node", can be used on its own or combined with others to support the various application scenarios discussed in Section 4. An additional library developed to support spherical harmonic transform operations is introduced in Section 3.2. Careful consideration has been given to the overall computational efficiency of the JSAmbisonics library, particularly regarding the spatial encoding and decoding schemes, optimized for real-time production and delivery of immersive web content.

1 Introduction

Emerging trends in the delivery of audiovisual content currently target increased immersion. After the increase in bandwidth and computational power that made delivery of high-quality audio and video content possible on devices such as smartphones, making this content immersive is considered a requirement for a leap in user experience compared to the traditional modes of enjoying audiovisual content. Virtual and augmented reality technology has also resurfaced, targeting mobile platforms and seemingly closer to large-scale deployment. Spatial sound is a fundamental component of these immersive technologies.

Effective spatial sound tools for the creation of immersive content are well known from an audio engineering point of view: panning tools for loudspeakers, binaural filters for headphones, and reverberation and decorrelation for a sense of space. One approach to spatial scene description and generation is to define all individual sound sources and the environment along with their spatialization parameters, an approach termed object-based spatial audio. An alternative is a scene-based description, in which the audio signals describe a full sound scene. Such a representation has certain advantages over the object-based approach, as long as the format is adequate to reproduce the sound scene with high perceptual quality and there is no intention of re-mixing the scene components at the client side. These advantages are lower transmission requirements, compared to a high number of object channels, efficient implementation of scene effects such as rotations, and direct mixing with recorded sound scenes.

Ambisonics [1,2,3,4] is such a method, with the main advantage that it offers a canonical and hierarchical representation of the spatial sound scene, and it is computationally efficient. Ambisonics treats synthetic and captured sound scenes in a common framework, which makes it especially suitable for spherical audio recording in conjunction with spherical video. Furthermore, it provides a suitable method for rendering to headphones through a combination of ambisonic theory and binaural filters, and suitable tools for rotations and manipulations of the scene.

This paper presents an Ambisonics audio library that utilizes the Web Audio API (WAA) [5] for interactive spatial sound processing on the web [6]. That makes the library useful for spatial sound creation on any modern browser that supports WAA. The library is written in JavaScript (JS) and is easy to use and incorporate in a web application. Special effort has been made to keep the library comprehensive and extensible. The library supports Higher-order Ambisonics (HOA) of arbitrary order and implements the most fundamental ambisonic processing blocks for generating and reproducing a sound scene. These operations, their implementation, and potential applications are presented below.

2 Ambisonics background

2.1 Sound scene description in Ambisonics

Assuming that all sound sources are in the far field, a general sound scene can be described as a continuous distribution of plane waves with spatio-temporal amplitude a(t, γ) for a plane wave incident from direction γ = [cos φ cos θ, sin φ cos θ, sin θ]^T, with (φ, θ) being the azimuth and elevation angles respectively. By taking the spherical harmonic transform (SHT) of the amplitude density, we arrive at the ambisonic description of the sound scene, encoded into the SH coefficients of the amplitude density a, or equivalently, the ambisonic signals

a(t) = SHT{a(t, γ)} = ∫_γ a(t, γ) y(γ) dγ,    (1)

where ∫_γ dγ denotes integration over the surface of the unit sphere, and dγ = cos θ dθ dφ is the differential surface element. The basis vector y(γ) contains all SHs up to a specified maximum order N. For a SHT of order N, there are M = (N+1)² SHs and ambisonic signals. Following established HOA conventions, real SHs are used, defined as

Y_nm(θ, φ) = √[ (2n+1) (n−|m|)! / (n+|m|)! ] P_n|m|(sin θ) y_m(φ),    (2)

with

y_m(φ) = { √2 sin |m|φ,  m < 0;   1,  m = 0;   √2 cos mφ,  m > 0, }    (3)

and P_nm the associated Legendre functions of degree n. The SHs are orthonormal, with

∫_γ y(γ) y^T(γ) dγ = 4π I,    (4)

where I is the M×M identity matrix. Using this power normalization, the 0th-order ambisonic signal a_00 is equivalent to an omnidirectional signal at the origin.

The ordering of the SHs, and consequently of the ambisonic signals, most commonly used in scientific fields is

[y(γ)]_q = Y_nm(γ),  with q = 1, 2, ..., (N+1)²  and  q = n² + n + m + 1.    (5)

From the index q, the mode numbers (n, m) can be recovered as n = ⌊√(q−1)⌋ and m = q − n² − n − 1. In HOA literature, the ordering of Eq. 5 is known as the ACN ambisonic channel ordering, and the normalization of Eqs. 2 & 4 as N3D normalization.

2.2 Ambisonic encoding

Encoding of a plane-wave source carrying a signal s(t), incident from γ_0, to ambisonic signals is given by

a(t) = s(t) y(γ_0),    (6)

so that the signals of K sources can be encoded as

a(t) = Σ_{k=1}^{K} s_k(t) y(γ_k).    (7)
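As a concrete illustration of Eq. 6 at first order (N = 1), the sketch below evaluates the real SHs of Eq. 2 in ACN/N3D ordering and scales a mono sample by them. This is stand-alone code for illustration, not the library's monoEncoder interface:

```javascript
// Real SHs up to order 1 in ACN/N3D ordering: [Y00, Y1-1, Y10, Y11]
// azi = azimuth, ele = elevation, both in radians
function shOrder1(azi, ele) {
  const s3 = Math.sqrt(3);
  return [
    1,                                    // Y00: omnidirectional
    s3 * Math.sin(azi) * Math.cos(ele),   // Y1,-1 (y axis)
    s3 * Math.sin(ele),                   // Y1,0  (z axis)
    s3 * Math.cos(azi) * Math.cos(ele),   // Y1,1  (x axis)
  ];
}

// Eq. 6: encode one mono sample s into four ambisonic samples
function encodeSample(s, azi, ele) {
  return shOrder1(azi, ele).map((y) => s * y);
}
```

For a frontal source (azi = ele = 0) this yields [s, 0, 0, √3·s], i.e. all energy in the omnidirectional and x-axis channels, as expected.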

2.3 Ambisonic rotation

Rotation of the sound scene can be conveniently performed in the SHD by applying a SH rotation matrix to the ambisonic signals. More specifically, for a rotation of the coordinate system given by the three Euler angles α, β, γ, the signals of the rotated scene are given by

a_n^rot(t) = M_n^rot(α, β, γ) a_n(t),  with n = 1, 2, ..., N,    (8)

where a_n = [a_n(−n), ..., a_nn]^T denotes the ambisonic signals of order n, and M_n^rot is the (2n+1)×(2n+1) rotation matrix for that order. Semi-closed-form solutions for the rotation matrices exist only for complex SHs, and they are too computationally demanding to compute for real-time applications. However, fast recursive algorithms exist for the rotation of real SHs that are efficient and suitable for ambisonic processing [7,8].

2.4 Ambisonic reflection

Reflection, or mirroring, of the sound scene about the principal planes yz (front-back), xz (left-right), or xy (up-down) becomes a trivial operation in the SHD due to symmetry properties of the SHs. As SHs are either symmetric or antisymmetric with respect to these planes, the ambisonic signals either remain the same under reflection (for symmetric SHs) or are inverted (for antisymmetric SHs). Hence, reflection reduces to inverting the polarity of specific sets of ambisonic signals, depending on the reflection plane:

(m < 0 ∧ m even) ∪ (m ≥ 0 ∧ m odd) :  yz    (9)
m < 0 :  xz    (10)
(n + m) odd :  xy.    (11)
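Since each condition in Eqs. 9-11 depends only on (n, m), mirroring reduces to a per-channel polarity flip. A stand-alone sketch, assuming the 1-based ACN-style indexing of Eq. 5 (not the library's sceneMirror object):

```javascript
// Recover mode numbers (n, m) from the 1-based channel index q of Eq. 5
function indexToMode(q) {
  const n = Math.floor(Math.sqrt(q - 1));
  return [n, q - n * n - n - 1];
}

// Polarity (+1/-1) of channel q under mirroring about "yz", "xz" or "xy",
// following the conditions of Eqs. 9-11
function mirrorSign(q, plane) {
  const [n, m] = indexToMode(q);
  let flip;
  if (plane === "yz") flip = (m < 0 && m % 2 === 0) || (m >= 0 && m % 2 === 1);
  else if (plane === "xz") flip = m < 0;
  else flip = (n + m) % 2 === 1; // "xy"
  return flip ? -1 : 1;
}

// Mirror one frame of ambisonic samples
function mirrorFrame(frame, plane) {
  return frame.map((x, i) => x * mirrorSign(i + 1, plane));
}
```

At first order the frame holds [W, Y, Z, X]-like content in ACN order, so an xz mirror flips only the second channel, a yz mirror only the fourth, and an xy mirror only the third.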

2.5 Ambisonic beamforming

Beamforming in the SHD reduces to a weight-and-sum operation on the SH signals. In the ambisonic literature, SH beamforming has traditionally been termed a virtual microphone. In case the directional pattern of the virtual microphone is axisymmetric, which is usually the case of interest, the virtual-microphone signal x_vm(t) is given by

x_vm(t, γ_0) = w^T(γ_0) a(t),    (12)

where γ_0 is the orientation of the virtual microphone, and w(γ_0) the (N+1)²-element vector of beamforming weights. The weight vector follows the ordering of the SHs, and can be expressed as a pattern-dependent part and a rotation-dependent part as

[w(γ_0)]_q = w_nm = c_n Y_nm(γ_0).    (13)

The (N+1) coefficients c_n are derived according to the desired properties of the virtual microphone; some patterns of interest are presented below.
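Eqs. 12-13 amount to a dot product between per-channel weights c_n Y_nm(γ_0) and the ambisonic frame. A first-order stand-alone sketch follows; the values c_0 = 1/2, c_1 = 1/6 used in the usage note are the first-order cardioid coefficients obtained from Eq. 25 further below, chosen purely as an example:

```javascript
// Real SHs up to order 1, ACN/N3D: [Y00, Y1-1, Y10, Y11]
function shOrder1(azi, ele) {
  const s3 = Math.sqrt(3);
  return [1, s3 * Math.sin(azi) * Math.cos(ele),
          s3 * Math.sin(ele), s3 * Math.cos(azi) * Math.cos(ele)];
}

// Eq. 13: beamforming weights for an axisymmetric pattern steered to (azi, ele).
// cn = [c0, c1]; e.g. [1/2, 1/6] gives a first-order cardioid.
function vmWeights(cn, azi, ele) {
  return shOrder1(azi, ele).map((yq, q) => (q === 0 ? cn[0] : cn[1]) * yq);
}

// Eq. 12: weight-and-sum over one frame of ambisonic samples
function virtualMic(weights, frame) {
  return weights.reduce((acc, w, q) => acc + w * frame[q], 0);
}
```

Steering a cardioid to the front and feeding it a plane wave encoded from the front (Eq. 6) returns the full signal, while a wave from the back is cancelled, as a cardioid should behave.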

2.6 Ambisonic decoding

2.6.1 Loudspeaker decoding

The ambisonic signals can be distributed to a playback setup through a decoding mixing matrix, a process termed ambisonic decoding. Commonly, this decoding matrix is frequency-independent, especially in HOA. Its design can be performed according to physical or psychoacoustical criteria. The signals x_ls = [x_1, ..., x_L]^T for L loudspeakers are then obtained by

x_ls(t) = D_ls a(t),    (14)

where D_ls is the L×(N+1)² decoding matrix. Some straightforward designs for the decoding matrix are the following:

Sampling:       D_ls = (1/L) Y_L^T    (15)
Mode-matching:  D_ls = (Y_L^T Y_L + β² I)^{-1} Y_L^T    (16)
ALLRAD:         D_ls = (1/N_td) G_td Y_td^T    (17)

where Y_L = [y(γ_1), ..., y(γ_L)] is the (N+1)²×L matrix of SHs at the loudspeaker directions. In the mode-matching approach, the least-squares solution is usually constrained with a regularization value β. In the ALLRAD method [4], Y_td = [y(γ_1), ..., y(γ_Ntd)] is the matrix of SHs at the N_td directions of a uniform spherical t-design [9], with t ≥ 2N+1, while G_td is an L×N_td matrix of vector-base amplitude panning (VBAP) gains [10], with the t-design directions considered as virtual sources.
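As an example of Eq. 15, the sketch below builds a first-order sampling decoder for a hypothetical square layout of four loudspeakers in the horizontal plane (a toy setup chosen for illustration only):

```javascript
// Real SHs up to order 1, ACN/N3D
function shOrder1(azi, ele) {
  const s3 = Math.sqrt(3);
  return [1, s3 * Math.sin(azi) * Math.cos(ele),
          s3 * Math.sin(ele), s3 * Math.cos(azi) * Math.cos(ele)];
}

// Eq. 15: sampling decoder, one row of (1/L) * y(speaker_dir)^T per loudspeaker
function samplingDecoder(speakerAzis) {
  const L = speakerAzis.length;
  return speakerAzis.map((azi) => shOrder1(azi, 0).map((y) => y / L));
}

// Eq. 14: decode one ambisonic frame to loudspeaker samples
function decodeFrame(D, frame) {
  return D.map((row) => row.reduce((acc, d, q) => acc + d * frame[q], 0));
}
```

Decoding a unit plane wave from the front with speakers at 0°, 90°, 180° and 270° gives gains 1, 0.25, -0.5 and 0.25: the sampling decoder concentrates the signal at the front speaker but leaks, with inverted polarity, to the rear, which is one reason the regularized and ALLRAD designs are often preferred.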

2.6.2 Binaural decoding

Ambisonics is suitable for headphone reproduction by integrating head-related transfer functions (HRTFs). As HRTFs are frequency-dependent, so are the decoding matrices in this case. More specifically, the binaural signals x_bin = [x_L, x_R]^T are given by

x_bin(f) = D_bin(f) a(f),    (18)

with D_bin being the 2×(N+1)² decoding matrix. In the time domain, Eq. 18 translates to a sum of convolutions,

x_bin(t) = [ Σ_{q=1}^{(N+1)²} d_q^L(t) ∗ a_q(t) ,  Σ_{q=1}^{(N+1)²} d_q^R(t) ∗ a_q(t) ]^T,    (19)

where (∗) denotes convolution and d_q^L(t) = IFT{[D_bin]_{1,q}(f)} is the filter derived from the inverse Fourier transform of the q-th entry of the decoding matrix for the left ear, and similarly for the right. Hence, in the general case 2(N+1)² convolutions are required for binaural decoding.

There are two ways to derive the decoding-matrix coefficients, or equivalently the filters. The direct approach takes advantage of Parseval's theorem for the SHT, which for a sound distribution a(f, γ) and, e.g., the left HRTF h_L(f, γ) states that

x_L(f) = ∫_γ a(f, γ) h_L(f, γ) dγ
       = SHT{a(f, γ)} · SHT{h_L(f, γ)}
       = h_L^T(f) a(f),    (20)

where h_L are the coefficients of the SHT applied to the HRTF. Eq. 20 states that the binaural signals are the result of the inner product between the ambisonic coefficients and the SH coefficients of the HRTFs. Hence the decoding matrix in this case is D_bin(f) = [h_L(f), h_R(f)]^T. Expansion of HRTFs into SH coefficients has been researched extensively, mainly in the context of HRTF interpolation [11,12,13].

The second way, and the one seen more often in the literature [14,15], is the virtual-loudspeaker approach, in which plane-wave signals are decoded with a decoding matrix of preference D_vls, covering the sphere adequately, and subsequently convolved with the HRTFs for the decoding directions. The number K of decoding directions is selected to be high enough for the order of the available ambisonic signals, with K > (N+1)². Formulated in the frequency domain, the virtual-loudspeaker approach becomes

x_bin(f) = H_LR(f) D_vls a(f) = D_bin(f) a(f),    (21)

where

H_LR = [ h_L(f, γ_1)  h_R(f, γ_1) ;  ... ;  h_L(f, γ_K)  h_R(f, γ_K) ]^T    (22)

is the matrix of HRTFs for the decoding directions. Note that the final ambisonic decoding matrix D_bin = H_LR D_vls is again of size 2×(N+1)², no matter the number of decoding directions K.

If it is assumed that the left and right HRTFs are antisymmetric with respect to the median plane (termed here xz-antisymmetry), e.g. when non-personalized HRTFs are applied, then what the right ear would capture is the left-ear signal for the sound scene mirrored with respect to the median plane. Such mirroring corresponds to Eq. 10. In practice, that means that only the (N+1)² left-ear HRTF filters need to be applied to derive both ear signals. Either of the two methods presented above can be used to compute the filters. Assuming two intermediate signals M(t) and S(t), with

M(t) = Σ_{q|m≥0} d_q^L(t) ∗ a_q(t)
S(t) = Σ_{q|m<0} d_q^L(t) ∗ a_q(t),    (23)

the binaural signals can be derived simply by

x_bin(t) = [ M(t) + S(t) ,  M(t) − S(t) ]^T.    (24)

This formulation is of practical importance for real-time applications since it reduces the required number of convolutions by half. This fact has been noted in the literature for the virtual-loudspeaker approach, assuming antisymmetric arrangements [15]. It can also be seen, however, from a purely ambisonic perspective, as shown above.
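The halving of Eqs. 23-24 can be illustrated even with degenerate single-tap "filters", i.e. plain per-channel gains standing in for the left-ear decoding filters d_q^L. The gain values in the test below are hypothetical, chosen only for illustration:

```javascript
// Recover (n, m) from the 1-based channel index q of Eq. 5
function indexToMode(q) {
  const n = Math.floor(Math.sqrt(q - 1));
  return [n, q - n * n - n - 1];
}

// Eqs. 23-24 with single-tap left-ear filters dL (per-channel gains)
// applied to one frame of (N+1)^2 ambisonic samples in the ordering of Eq. 5.
function binauralMS(dL, frame) {
  let M = 0, S = 0;
  frame.forEach((a, i) => {
    const [, m] = indexToMode(i + 1);
    if (m >= 0) M += dL[i] * a; // symmetric channels
    else        S += dL[i] * a; // antisymmetric channels
  });
  return [M + S, M - S];        // [left, right]
}
```

Mirroring the input scene about the xz plane (flipping the m < 0 channels, Eq. 10) negates S and therefore swaps the two ears, which is exactly the antisymmetry assumption behind the construction.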

3 Implementation

3.1 Web Audio API

WAA contains all the signal processing elements that permit the realization of ambisonic processing. More specifically, since the ambisonic operations are all either frequency-independent or frequency-dependent linear processes, they can be realized with gain factors, convolutions, and summations of the ambisonic signals. In WAA, fundamental signal processing blocks are called Audio Nodes. Three such audio nodes are used in the implementation of all ambisonic processing blocks. The first is the Gain Node, a simple signal multiplier with user-controlled gain at runtime. The second is a convolution block, the Convolver Node, which performs linear convolution with user-specified FIR filters; this block is utilized for the convolutions in the binaural decoding stage. Finally, the (N+1)² channels of a specified order are grouped into a single stream when sent from one ambisonic block to another using the Channel Merger Node, and split again into the constituent channels with the Channel Splitter Node when received by an ambisonic block for processing.

Vector and matrix operations on the ambisonic signals are realized with groups of gain nodes and by summing the resulting channels appropriately. An alternative would be the Audio Worker Node, in which JS code is applied directly to the audio buffers. However, the built-in gain nodes handle fast updating of values at runtime without artifacts, and the benefit of an audio-worker implementation is expected to be small, if any. An implementation and comparison of such an approach is planned as future work.


3.2 JS Spherical Harmonic Transform library

Since there was no existing JS library for the spherical harmonic operations involved in ambisonic processing, a custom-made one was created for this project [16]. The library performs the following basic operations:

•Computation of all associated Legendre functions up to a maximum degree N, for sets of points, using fast recursive formulas [17].

•Computation of all real SHs up to a specified order N, for sets of directions.

•Computation of the forward SHT, using either a direct weighted-sum approach over the data points or a least-squares approach. The transform returns a vector of SH coefficients.

•Computation of the inverse SHT at an arbitrary direction, using the SH coefficients from the forward transform.

•Computation of rotation matrices in the SHD, using the fast recursive solution of [7] for real SHs.

Applications of the JSHT library are not limited to web audio and ambisonics. Graphics and scientific applications that benefit from a spherical spectral representation can use it for demonstrative purposes deployed on the web; spherical interpolation of directional data is one such example.
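A minimal sketch of the first operation above: associated Legendre functions P_n^m(x) computed with the standard three-term recursions, here without the Condon-Shortley phase, as is common in ambisonics. This is illustrative stand-alone code, not the JSHT library's own routine:

```javascript
// P_n^m(x) for 0 <= m <= n, without the Condon-Shortley phase.
// Recursions: P_m^m = (2m-1)!! (1-x^2)^{m/2},
//             P_{m+1}^m = x (2m+1) P_m^m,
//             (n-m) P_n^m = x (2n-1) P_{n-1}^m - (n+m-1) P_{n-2}^m.
function assocLegendre(n, m, x) {
  let pmm = 1;
  const somx2 = Math.sqrt((1 - x) * (1 + x));
  for (let i = 1; i <= m; i++) pmm *= (2 * i - 1) * somx2; // diagonal start
  if (n === m) return pmm;
  let pmmp1 = x * (2 * m + 1) * pmm;                       // first step up in n
  if (n === m + 1) return pmmp1;
  let pnm = 0;
  for (let nn = m + 2; nn <= n; nn++) {                    // upward recursion in n
    pnm = (x * (2 * nn - 1) * pmmp1 - (nn + m - 1) * pmm) / (nn - m);
    pmm = pmmp1;
    pmmp1 = pnm;
  }
  return pnm;
}
```

The recursion is numerically stable in the upward direction in n, which is why it is the usual choice for computing whole sets of functions at once.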

3.3 JS Ambisonics library

The WAA Ambisonics library implements a set of audio processing blocks that realize most of the fundamental operations presented in Sec. 2. SH computations are performed internally using the JS SHT library described above. All ambisonic processing follows the ACN/N3D convention; however, a number of blocks are provided for converting other channel-ordering and normalization conventions to this specification. All ambisonic blocks expose an in and an out node that can be used for WAA-style connection of audio blocks. Furthermore, they expose properties and methods that can be updated at runtime for interactive operation. For detailed documentation of the object properties the reader is referred to [6].

3.3.1 Encoding, Rotation & Mirroring

The monoEncoder object takes a monophonic sound stream and encodes it into an ambisonic stream of a user-specified order, at a user-specified direction, using Eq. 6. The source direction can be updated interactively at runtime.

The sceneRotator object takes an ambisonic stream of a certain order and returns the stream of the same order for a rotated sound scene. The scene rotation is given in the yaw-pitch-roll convention. To avoid redundant computations, the ambisonic signals of each order n are multiplied only with the rotation matrix M_n^rot of that order, as shown in Eq. 8. The sceneMirror object implements mirroring through the polarity inversions of Eqs. 9-11. Both rotation and mirroring can be updated interactively.

3.3.2 Virtual Microphones

The virtualMic object implements an ambisonic beamformer of a user-specified type and orientation. The block implements Eq. 12, with the following options controlling the type of a virtual microphone of order N through the coefficients c_n of Eq. 13:

cardioid:       c_n = N! N! / [(N+n+1)! (N−n)!]    (25)
hypercardioid:  c_n = 1 / (N+1)²    (26)
max-rE:         c_n = P_n(cos κ_N) / Σ_{n=0}^{N} (2n+1) P_n(cos κ_N),    (27)

with κ_N = 2.407/(N+1.51) as given in [4]. Higher-order cardioids are defined as a normal cardioid raised to the power of N. Higher-order hypercardioids maximize the directivity factor for a given order; in the spherical beamforming literature they are also known as regular or plane-wave-decomposition beamformers. The max-rE pattern originates from the ambisonic literature and maximizes the acoustic intensity vector in an isotropic diffuse field. Apart from the above, higher-order supercardioids are also implemented up to 4th order, with the coefficients converted appropriately from [18]. Supercardioids maximize the front-to-back power ratio for a given order.
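The coefficient formulas of Eqs. 25-27 are straightforward to evaluate; a stand-alone sketch (with a plain Legendre-polynomial helper, not the library's virtualMic implementation) follows:

```javascript
// Factorial
const fact = (n) => (n <= 1 ? 1 : n * fact(n - 1));

// Legendre polynomial P_n(x) via the Bonnet recursion
function legendreP(n, x) {
  let p0 = 1, p1 = x;
  if (n === 0) return p0;
  for (let k = 2; k <= n; k++) {
    [p0, p1] = [p1, ((2 * k - 1) * x * p1 - (k - 1) * p0) / k];
  }
  return p1;
}

// Coefficients cn (n = 0..N) for the axisymmetric patterns of Eqs. 25-27
function patternCoeffs(type, N) {
  const cn = [];
  for (let n = 0; n <= N; n++) {
    if (type === "cardioid") cn.push((fact(N) * fact(N)) / (fact(N + n + 1) * fact(N - n)));
    else if (type === "hypercardioid") cn.push(1 / ((N + 1) * (N + 1)));
    else cn.push(legendreP(n, Math.cos(2.407 / (N + 1.51)))); // max-rE, unnormalized
  }
  if (type === "max-rE") {
    const norm = cn.reduce((acc, c, n) => acc + (2 * n + 1) * c, 0);
    return cn.map((c) => c / norm);
  }
  return cn;
}
```

With these normalizations all three patterns satisfy Σ_n (2n+1) c_n = 1, i.e. unit gain in the look direction of the virtual microphone.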

3.3.3 Conversion between formats

All operations are internally performed using the ACN/N3D specification. However, the vast majority of recorded ambisonic material is first-order and follows the traditional B-format specification with WXYZ channel ordering. Conversion from this specification to ACN/N3D can be expressed by the conversion matrix

x_ACN/N3D = [ √2  0   0   0  ;
              0   0   √3  0  ;
              0   0   0   √3 ;
              0   √3  0   0  ] x_WXYZ.    (28)

Regarding HOA, the first existing specification is the Furse-Malham (FuMa) one [19], defined up to third order. Conversion from WXYZ or FuMa to ACN/N3D can be performed with the converters.bf2acn and converters.fuma2acn objects respectively. Note that the first-order specification of FuMa is the same as the traditional WXYZ one.

Recent HOA research and technology uses the ACN ordering scheme as the standard. In terms of SH normalization, however, there are two popular schemes: the orthonormal N3D, which is used throughout this library, and the Schmidt semi-normalized one, known as SN3D in the ambisonic literature. Conversion between the two is trivial and given by

x_nm|SN3D = x_nm|N3D / √(2n+1)    (29)
x_nm|N3D = √(2n+1) x_nm|SN3D.    (30)

Conversion between the two specifications can be performed with the blocks converters.n3d2sn3d and converters.sn3d2n3d.
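All of these conversions are simple per-channel scalings or reorderings; a stand-alone sketch over ACN-ordered frames (for illustration, not the library's converter blocks themselves):

```javascript
// Recover (n, m) from the 1-based channel index q of Eq. 5
function indexToMode(q) {
  const n = Math.floor(Math.sqrt(q - 1));
  return [n, q - n * n - n - 1];
}

// Eq. 29: N3D -> SN3D, dividing each channel by sqrt(2n+1)
function n3dToSn3d(frame) {
  return frame.map((x, i) => x / Math.sqrt(2 * indexToMode(i + 1)[0] + 1));
}

// Eq. 30: SN3D -> N3D
function sn3dToN3d(frame) {
  return frame.map((x, i) => x * Math.sqrt(2 * indexToMode(i + 1)[0] + 1));
}

// Eq. 28: traditional B-format WXYZ -> first-order ACN/N3D
function wxyzToAcnN3d([W, X, Y, Z]) {
  const s2 = Math.sqrt(2), s3 = Math.sqrt(3);
  return [s2 * W, s3 * Y, s3 * Z, s3 * X];
}
```

Note that Eq. 28 combines a reordering (W, Y, Z, X) with the per-order scalings of Eq. 30, since traditional B-format is SN3D-like in level apart from the −3 dB W channel.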

3.3.4 Acoustic Visualization

It is possible to extract information from the ambisonic signals about the directional distribution of sound in the scene. One such approach is based on the acoustic active intensity, expressing the net flow of energy through the notional center of the sound scene, and the diffuseness, expressing the portion of energy that is not propagating due to either modal or diffuse behavior. These parameters require only the first-order ambisonic signals, which correspond to acoustic pressure and velocity; see for example [20]. Examples of how diffuseness and intensity may be used to visualize sound sources in the scene can be found in the code examples [6]. Their broadband versions can be extracted using the intensityAnalyzer block, computed at each processing block of WAA. More refined visualizations can be obtained if the intensity and diffuseness are computed in frequency bands, e.g. using the biquad filter structures of WAA.
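As a minimal sketch of the idea (not the library's intensityAnalyzer block, and ignoring normalization constants that cancel after the final scaling), a broadband direction-of-arrival estimate can be formed from the time-averaged products of the pressure channel with the three first-order channels:

```javascript
// Broadband active-intensity direction estimate from first-order ACN/N3D
// frames: ch 0 = pressure-like (W), ch 1..3 = velocity-related (Y, Z, X).
// 'frames' is an array of [a00, a1-1, a10, a11] sample vectors.
function intensityDirection(frames) {
  let ix = 0, iy = 0, iz = 0;
  for (const [p, y, z, x] of frames) {
    ix += p * x; // time-averaged pressure * velocity-component products
    iy += p * y;
    iz += p * z;
  }
  const norm = Math.hypot(ix, iy, iz) || 1;
  return [ix / norm, iy / norm, iz / norm]; // unit DOA vector (x, y, z)
}
```

For a single plane wave encoded with Eq. 6, the estimate points back at the source direction.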

3.4 Decoding filter generation and SOFA integration

Binaural decoding is implemented in the binDecoder block, paired with the hoaLoader and hrirLoader blocks that handle the loading of user-defined binaural decoding filters.

Using the hoaLoader, users can choose both the HRIR set and the decoding approach. An additional Matlab script based on the Higher-Order-Ambisonics library [21] is available for offline generation of HOA decoding filters. Some decoding filters are already included in the repository, based on the LISTEN HRTF sets [22] and derived using the ALLRAD method of Eq. 17. Both decoding approaches mentioned in Sec. 2.6.2 were tested for the derivation of decoding filters. The virtual-loudspeaker approach was found superior to the direct approach of Eq. 20 in terms of preserving timbre, as the latter suffered from severe high-frequency loss at lower orders. Note that an approximate timbre correction can be applied to counteract this effect, as proposed in [23].

The hrirLoader block, on the other hand, allows for on-the-fly loading of HRIR filters, internally converted to HOA decoding filters to be used by the binDecoder block. The hrirLoader implementation is based on the HrtfSet class of the binauralFIR library [24], featuring server-based HRIR loading, which grants access to an extensive choice of HRTF sets without cluttering the library itself. At the time of writing, the hrirLoader relies on locally embedded JSON HRTF sets, awaiting the publication of the IRCAM OpenDAP SOFA server [25].

4 Applications

The library is relevant to any web application that delivers or involves immersive content. Some examples of special interest are highlighted below:

•Reproduction of spherical audio and video for telepresence. In this scenario an ambisonic audio stream is delivered to the client along with a spherical video. The audio part is rendered binaurally on the target platform, including head-rotation information, giving a convincing sense of presence.

•Reproduction of audio-only or audiovisual compositions, with the sound part encoded into a few ambisonic channels using the provided tools and broadcast to multiple clients, with binaural rendering done independently on each of them.

•Web VR/AR applications in which the audio components are updated in real time and encoded into ambisonic streams, avoiding costly binaural rendering of multiple sources and reverberation while still performing rotation of the sound scene.

•Web video games with immersive spatial sound.

•Interactive visualization driven by spatial properties of the sound scenes, for extracting acoustic information or for artistic uses.

Some basic examples highlighting these applications are included in the code repository [6].

References

[1] M. A. Gerzon, "Periphony: With-height sound reproduction," Journal of the Audio Engineering Society, vol. 21, no. 1, pp. 2–10, 1973.

[2] S. Moreau, S. Bertet, and J. Daniel, "3D sound field recording with higher order ambisonics – objective measurements and validation of spherical microphone," in 120th Convention of the AES, (Paris, France), 2006.

[3] M. A. Poletti, "Three-dimensional surround sound systems based on spherical harmonics," Journal of the Audio Engineering Society, vol. 53, no. 11, pp. 1004–1025, 2005.

[4] F. Zotter and M. Frank, "All-round ambisonic panning and decoding," Journal of the Audio Engineering Society, vol. 60, no. 10, pp. 807–820, 2012.

[5] W3C, "Web Audio API," 12 2015. https://www.w3.org/TR/webaudio/.

[6] A. Politis and D. Poirier-Quinot, "JSAmbisonics: A Web Audio library for interactive spatial sound processing on the web." https://github.com/polarch/JSAmbisonics.

[7] J. Ivanic and K. Ruedenberg, "Rotation matrices for real spherical harmonics. Direct determination by recursion," The Journal of Physical Chemistry, vol. 100, no. 15, pp. 6342–6347, 1996.

[8] M. A. Blanco, M. Flórez, and M. Bermejo, "Evaluation of the rotation matrices in the basis of real spherical harmonics," Journal of Molecular Structure: THEOCHEM, vol. 419, no. 1, pp. 19–27, 1997.

[9] R. H. Hardin and N. J. Sloane, "McLaren's improved snub cube and other new spherical designs in three dimensions," Discrete & Computational Geometry, vol. 15, no. 4, pp. 429–441, 1996.

[10] V. Pulkki, "Virtual sound source positioning using vector base amplitude panning," Journal of the Audio Engineering Society, vol. 45, no. 6, pp. 456–466, 1997.

[11] M. J. Evans, J. A. S. Angus, and A. I. Tew, "Analyzing head-related transfer function measurements using surface spherical harmonics," The Journal of the Acoustical Society of America, vol. 104, no. 4, pp. 2400–2411, 1998.

[12] D. N. Zotkin, R. Duraiswami, and N. A. Gumerov, "Regularized HRTF fitting using spherical harmonics," in IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), (New Paltz, NY, USA), 2009.

[13] G. D. Romigh, D. S. Brungart, R. M. Stern, and B. D. Simpson, "Efficient real spherical harmonic representation of head-related transfer functions," IEEE Journal of Selected Topics in Signal Processing, vol. 9, no. 5, pp. 921–930, 2015.

[14] M. Noisternig, T. Musil, A. Sontacchi, and R. Höldrich, "3D binaural sound reproduction using a virtual ambisonic approach," in IEEE Int. Symposium on Virtual Environments, Human-Computer Interfaces and Measurement Systems (VECIMS), (Lugano, Switzerland), 2003.

[15] B. Wiggins, I. Paterson-Stephens, and P. Schillebeeckx, "The analysis of multi-channel sound reproduction algorithms using HRTF data," in 19th Int. Conf. of the AES, 2001.

[16] A. Politis, "A JavaScript library for the Spherical Harmonic Transform." https://github.com/polarch/Spherical-Harmonic-Transform-JS.

[17] E. W. Weisstein, "Associated Legendre polynomial." http://mathworld.wolfram.com/AssociatedLegendrePolynomial.html.

[18] G. W. Elko, "Differential microphone arrays," in Audio Signal Processing for Next-Generation Multimedia Communication Systems, pp. 11–65, Springer, 2004.

[19] Blue Ripple Sound, "HOA Technical Notes – B-format." http://www.blueripplesound.com/b-format.

[20] A. Politis, T. Pihlajamäki, and V. Pulkki, "Parametric spatial audio effects," in Int. Conf. on Digital Audio Effects (DAFx), (York, UK), 2012.

[21] A. Politis, "Higher Order Ambisonics library," 2015. https://github.com/polarch/Higher-Order-Ambisonics.

[22] O. Warusfel, "Listen HRTF database," IRCAM and AK, 2003. http://recherche.ircam.fr/equipes/salles/listen/index.html.

[23] J. Sheaffer, S. Villeval, and B. Rafaely, "Rendering binaural room impulse responses from spherical microphone array recordings using timbre correction," in EAA Joint Symposium on Auralization and Ambisonics, (Berlin, Germany), 2014.

[24] T. Carpentier, "Binaural synthesis with the Web Audio API," in 1st Web Audio Conference (WAC), 2015.

[25] IRCAM, "IRCAM OpenDAP Server." (To be published.)