Ville Pulkki

Aalto University · Department of Signal Processing and Acoustics

Professor (full)

About

285
Publications
54,501
Reads
5,159
Citations
Citations since 2016
107 Research Items
2687 Citations
[Chart: citations per year, 2016 to 2022; vertical axis 0 to 500]
Introduction
Ville Pulkki currently works at the Department of Signal Processing and Acoustics, Aalto University. Ville does research in the field of communication acoustics. Their current project is 'Parametric Time-Frequency Domain Spatial Audio'.

Publications (285)
Conference Paper
Full-text available
This paper builds upon a recently proposed spatial enhancement approach, which has demonstrated improvements in the perceived spatial accuracy of binaurally rendered signals using head-worn microphone arrays. The foundation of the approach is a parametric sound-field model, which assumes the existence of a single source and an isotropic diffuse com...
Conference Paper
Full-text available
Spherical microphone arrays may be used to capture the directional characteristics of a room acoustic response. Spatial impulse response rendering (SIRR) is a method for parameterizing the response in terms of its principal directional and diffuse components, which allows for subsequent spatially enhanced reproduction of these captured spatial char...
Conference Paper
Full-text available
The spatial super-hearing technology originally proposed by the present research group brings ultrasonic signals into the audible range and auralises them using headphones, in such a manner that the listener is also able to localise the sources through spatial hearing. The signals are captured using a microphone array, and the direction-of-arrival...
Conference Paper
Full-text available
The acoustics of coupled rooms is often more complex than single rooms due to the increase in features such as double-slope decays, direct sound occlusion and anisotropic reverberation. For directional capture, analysis and reproduction of room acoustics, spatial room impulse responses (SRIRs) can be utilised, but measuring SRIRs at multiple positi...
Conference Paper
Full-text available
This paper presents a spatial post-filtering algorithm for passive-sonar systems deploying linear hydrophone arrays. The algorithm provides an attenuation parameter in the time-frequency domain based on the normalised cross-spectral density between two signals, which originate from two coincidentally steered conventional beam-formers. The computed...
Conference Paper
Full-text available
Coupled rooms have a distinct sound energy decay behavior, which exhibits more than one decay time under certain conditions. The sound energy decay analysis in such scenarios requires decay models consisting of multiple exponentials with distinct decay rates and amplitudes. While multi-exponential decay analysis is commonly used in room acoustics,...
Conference Paper
Full-text available
The individual internal representation of the spectral cues that allow distinguishing between front and back is investigated by using the reverse correlation method. The stimuli were noise bursts presented randomly from either a front or back loudspeaker. For each trial, the spectrum of the noise was modified with a 1 ERB spaced gammatone filterban...
Conference Paper
Full-text available
Previous studies have reported that the direction of a band-limited sound stimulus in the median plane is localised based on its centre frequency instead of its actual location. The frequency band that determines this localisation is referred to as the directional-band. However, since most relevant studies employed a coarse localisation response sc...
Article
Full-text available
This exploratory study investigates the phenomenon of the auditory perceived aperture position (APAP): the point at which one feels they are in the boundary between two adjoined spaces, judged only using auditory senses. The APAP is likely the combined perception of multiple simultaneous auditory cue changes, such as energy, reverberation time, env...
Preprint
The decaying sound field in rooms is typically described in terms of energy decay functions (EDFs). Late reverberation can deviate considerably from the ideal diffuse field, for example, in scenes with multiple connected rooms or non-uniform absorption material distributions. This paper proposes the common-slope model of late reverberation. The mod...
Preprint
The decaying sound field in rooms is typically described in terms of energy decay functions (EDFs). Late reverberation can deviate considerably from the ideal diffuse field, for example, in scenes with multiple connected rooms or non-uniform absorption material distributions. This paper proposes the common-slope model of late reverberation. The m...
Article
Full-text available
An established model for sound energy decay functions (EDFs) is the superposition of multiple exponentials and a noise term. This work proposes a neural-network-based approach for estimating the model parameters from EDFs. The network is trained on synthetic EDFs and evaluated on two large datasets of over 20 000 EDF measurements conducted in vario...
Article
Full-text available
This letter presents a spatial post-filter that can be employed in linear hydrophone arrays, commonly found in sonar systems, for the task of improving the bearing estimation and noise suppression capabilities of traditional beamformers. The proposed filter is computed in the time-frequency domain as the normalised cross-spectral density between tw...
Article
Full-text available
This article proposes a parametric signal-dependent method for the task of encoding microphone array signals into Ambisonic signals. The proposed method is presented and evaluated in the context of encoding a simulated seven-sensor microphone array, which is mounted on an augmented reality headset device. Given the inherent flexibility of the Ambis...
Article
Full-text available
Spatial room impulse responses (SRIRs) capture room acoustics with directional information. SRIRs measured in coupled rooms and spaces with non-uniform absorption distribution may exhibit anisotropic reverberation decays and multiple decay slopes. However, noisy measurements with low signal-to-noise ratios pose issues in analysis and reproducti...
Preprint
Full-text available
An established model for sound energy decay functions (EDFs) is the superposition of multiple exponentials and a noise term. This work proposes a neural-network-based approach for estimating the model parameters from EDFs. The network is trained on synthetic EDFs and evaluated on two large datasets of over 20000 EDF measurements conducted in variou...
Article
Full-text available
This article proposes a system for object-based six-degrees-of-freedom (6DoF) rendering of spatial sound scenes that are captured using a distributed arrangement of multiple Ambisonic receivers. The approach is based on first identifying and tracking the positions of sound sources within the scene, followed by the isolation of their signals through...
Article
Full-text available
In this article, the application of spatial covariance matching is investigated for the task of producing spatially enhanced binaural signals using head-worn microphone arrays. A two-step processing paradigm is followed, whereby an initial estimate of the binaural signals is first produced using one of three suggested binaural rendering approaches....
Conference Paper
Full-text available
The sound field in coupled rooms or rooms with non-uniform absorptive material distributions can be considerably anisotropic. In such scenarios, the sound energy decays with more than one decay rate, thus making it practical to use a decay model that consists of multiple exponential decays and a noise term. In this work, we use a recently proposed...
Article
This paper introduces difference-spectrum filters that can be used to control the perceived vertical direction of a sound source presented from ear-level loudspeakers. The difference-spectrum filter was designed to mimic the macroscopic changes in the spectral envelope of head-related transfer functions (HRTFs) between a target elevation angle and...
Article
Full-text available
Objective: The objective of this study was to investigate the localization ability of bilateral cochlear implant (BiCI) users for virtual sound sources produced over a limited loudspeaker arrangement. Design: Ten BiCI users and 10 normal-hearing subjects participated in listening tests in which amplitude- and time-panned virtual sound sources were p...
Article
Full-text available
Auditory localisation accuracy may be degraded when a head-worn device (HWD), such as a helmet or hearing protector, is used. A computational method is proposed in this study for estimating how horizontal plane localisation is impaired by a HWD through distortions of interaural cues. Head-related impulse responses (HRIRs) of different HWDs were mea...
Conference Paper
Full-text available
The success of parametric approaches to spatial sound reproduction and sound field navigation depends on the accuracy of the initial analysis and decomposition of the sound field. In this work, the sector-based high-order extension to intensimetric sound field analysis is evaluated in the context of 3D source localization. The evaluation is performe...
Conference Paper
In this paper, we present a method to auralize acoustic scattering and occlusion of a single rigid sphere with parametric filters and neural networks to provide fast processing and estimation of parameters. The filter parameters are estimated using neural networks based on the geometric parameters of the simulated scene, e.g., relative receiver pos...
Conference Paper
Full-text available
This paper proposes an algorithm for rendering spread sound sources, which are mutually incoherent across their extents, over arbitrary playback formats. The approach involves first generating signals corresponding to the centre of the spread source for the intended playback setup, along with decorrelated variants, followed by defining a diffuse sp...
Conference Paper
Full-text available
Filter banks are an integral part of modern signal processing. They may also be applied to spatial filtering and the employed spatial filters can be designed with a specific shape for the analysis, e.g. suppressing side-lobes. After extracting spatially constrained signals from spherical harmonic (SH) input, i.e. filter bank analysis, many applic...
Conference Paper
Full-text available
Decomposing a sound-field into its individual components and respective parameters can represent a convenient first-step towards offering the user an intuitive means of controlling spatial audio effects and sound-field modification tools. The majority of such tools available today, however, are instead limited to linear combinations of signals or e...
Conference Paper
The perceptual experience of the transition between coupled rooms remains a little-investigated area of research. This paper presents a pipeline for auralising the transition between coupled rooms, utilising a time-varying partitioned convolution for fast position-dependent switching between spatial room impulse responses (SRIRs) and parametric bin...
Conference Paper
This paper presents Motus, a new dataset of higher-order Ambisonic room impulse responses. The measurements took place in a single room while varying the amount and placement of furniture. 830 different room configurations were measured with four source-to-receiver configurations, resulting in 3320 room impulse responses in total. The dataset featu...
Article
Full-text available
This work suggests a method of presenting information about the acoustical and geometric properties of a room as spherical images to a machine-learning algorithm to estimate acoustical parameters of the room. The approach has the advantage that the spatial distribution of the properties can be presented in a generic and potentially compact way to m...
Conference Paper
Full-text available
This paper proposes a system for localising and tracking multiple simultaneous acoustical sound sources in the spherical harmonic domain, intended as a precursor for developing parametric sound-field editors and spatial audio effects. The real-time system comprises a novel combination of a direct-path dominance test, grid-less subspace localisation...
Preprint
Full-text available
A fairly recent development in spatial audio is the concept of dividing a spherical sound field into several directionally-constrained regions, or sectors. Therefore, the sphere is spatially partitioned into components that should ideally reconstruct the unit sphere. When distributing such sectors uniformly on the sphere, their set makes up a bank...
Conference Paper
Full-text available
A fairly recent development in spatial audio is the concept of dividing a spherical sound field into several directionally-constrained regions, or sectors. Therefore, the sphere is spatially partitioned into components that should ideally reconstruct the unit sphere. When distributing such sectors uniformly on the sphere, their set makes up a bank...
Article
Full-text available
Beamforming using a circular array of hydrophones may be employed for the task of two-dimensional (2D) underwater sound-field visualisation. In this article, a parametric spatial post-filtering method is proposed, which is specifically intended for applications involving large circular arrays and aims to improve the spatial selectivity of tradition...
Article
Full-text available
Ultrasonic sources are inaudible to humans, and while digital signal processing techniques are available to bring ultrasonic signals into the audible range, there are currently no systems which also simultaneously permit the listener to localise the sources through spatial hearing. Therefore, we describe a method whereby an in-situ listener with no...
Article
This chapter broadly introduces the reader to sound quality. The concept of sound quality has a relatively long history of emergence. Probably the oldest sounds associated with a quality rating have been human speech and singing, then theatre and music‐making, including musical instruments. The development of physics and related mathematics started...
Article
This chapter covers some of the audio and speech techniques. Four areas of application are briefly discussed in separate sections: virtual reality, sonic interaction design, computational auditory scene analysis, and music information retrieval. The audio engine is used to render all sounds that the avatar would hear in the location. Audio content...
Article
Music is different from speech in that its role is not so much to convey linguistic and conceptual content as it is to evoke aesthetic and emotional experiences. This chapter begins with the discussion of the formation of sounds in acoustical and electric musical instruments. It briefly discusses some basic properties of acoustic and electric in...
Article
Simplified mathematical theories are essential for determining causalities and for predicting the perception evoked by a given stimulus, which provides the evident need for experimental analysis and modelling of hearing. This chapter describes several computational auditory models and their applications. The auditory models are classified as simple...
Article
This chapter discusses the needs and challenges faced in sound reproduction. A wide variety of applications in which sound needs to be reproduced such as: public address, full‐duplex speech communication, audio content production, broadcasting, computer games, virtual reality, accurate reproduction of sound, enhancement of acoustics and active nois...
Article
A common trend in the field of audio is to process the audio signal in the time–frequency domain. This chapter elaborates on the techniques of time–frequency transforms to visualize audio signals and introduces some phenomena, concepts, and issues related to the processing of audio in the time–frequency domain. It describes the time–frequency proce...
Article
Full-text available
While room acoustic measurements can accurately capture the sound field of real rooms, they are usually time consuming and tedious if many positions need to be measured. Therefore this contribution presents the Autonomous Robot Twin System for Room Acoustic Measurements (ARTSRAM) to autonomously capture large sets of room impulse responses with var...
Chapter
This chapter starts by discussing the most fundamental of questions regarding an auditory object: under what conditions does it exist? Two physical attributes limit the audibility of a frequency component of sound: the sound pressure level (SPL) and frequency. The attributes interact with tonal signals; the SPL threshold of audibility depends in a...
Chapter
This chapter discusses various methods used to study the functionality of hearing mechanisms by psychoacoustic means; that is, by presenting sound events to subjects and asking them to perform some tasks in a formal listening test method. Sound stimuli consist of sound events that enter the auditory system of the subject. A psychophysical function...
Chapter
This chapter provides a characterization of research methodologies for communication acoustics and how they evolve. Scientific and engineering knowledge of communication processes has developed over the last hundreds, even thousands, of years. There are three basic ways a scientist or engineer may acquire knowledge about a system or process, such a...
Chapter
Electroacoustic devices, particularly microphones, loudspeakers, and headphones, are essential components in speech communication, audio technology, and multimedia sound. This chapter reviews the electroacoustics of loudspeakers and microphones, the measurement of system responses, basic properties of the responses, and the equalization of the syst...
Chapter
This chapter provides a very brief overview of different technologies in speech coding, synthesis, and recognition. It focuses on acoustics, signal processing, and audio, and the linguistic and statistical aspects are, in many places, treated superficially. The chapter also provides a general description of the main fields in speech technology and...
Chapter
This chapter provides an overview of fundamental concepts in physical acoustics that are considered important in understanding communication by sound and voice, including the wave behaviour of sound in a free field, at material boundaries, and in closed spaces. Sound waves and vibrations can be explained as an alternation between two forms of energ...
Chapter
The purpose of hearing is to capture acoustic vibrations arriving at the ear and analyse the content of the signal to deliver information about the acoustic surroundings to the higher levels in the brain. This chapter provides a brief introduction to both the anatomy and physiology of the auditory system. It focuses on monaural phenomena; that is,...
Chapter
Spatial hearing develops substantially through learning and adaptation to gain more accuracy and better performance in complex environments. This chapter introduces spatial hearing and related concepts. The dummy heads are designed to approximate the head‐related acoustics of a typical human subject. The chapter describes the cues available to huma...
Chapter
Signal processing is the branch of engineering that provides efficient methods and techniques to analyse, synthesize, and transform signals. This chapter presents signal processing fundamentals with regard to sound and voice signals. Signal processing includes a set of methods that are important for understanding communication by sound and voice. I...
Chapter
There are four central quantities or dimensions of psychoacoustics, namely pitch, loudness, timbre, and subjective duration, all of which are relatively well defined and orthogonal to each other, except perhaps timbre. This chapter describes a few of these quantities which are useful in the research on psychoacoustics or in technical applications....
Chapter
This chapter discusses the basic concepts of technical audiology. As background, it provides a brief introduction to hearing impairments and disabilities. A hearing impairment can result in various symptoms. The main symptom is the degraded sensitivity of hearing (hearing loss), which can be in the form of a hearing threshold shift, decreased discr...
Chapter
The acoustic communication mode specific to human beings is speech. This chapter focuses on speech production from both physical and signal processing points of view. Spoken languages exhibit an enormous variation in speech units and their combination. Phonetics is the science that has developed ways to analyse and describe speech units and their f...
Chapter
This chapter describes the psychoacoustic quantities at the lowest level of analysis: pitch, loudness, timbre, and duration, which are more or less related to the physical quantities frequency, level, magnitude spectrum, and time. Pitch is perceived from many types of sounds, such as sinusoids, vocals, instrument sounds, and noisy sounds. However,...
Data
Supplemental material for ["Numerical simulations of near-field head-related transfer functions: Magnitude verification and validation with laser spark sources", J. Acoust. Soc. Am., 148(1), (2020)], assessing the laser-spark acoustical source.
Article
Full-text available
Despite possessing an increased perceptual significance, near-field head-related transfer functions (nf-HRTFs) are more difficult to acquire compared to far-field head-related transfer functions. If properly validated, numerical simulations could be employed to estimate nf-HRTFs: the present study aims to validate the usage of wave-based simulation...
Article
Despite possessing an increased perceptual significance, near-field head-related transfer functions (nf-HRTFs) are more difficult to acquire compared to far-field head-related transfer functions. If properly validated, numerical simulations could be employed to estimate nf-HRTFs: the present study aims to validate the usage of wave-based simulation...
Article
Full-text available
This article details an investigation into the perceptual effects of different rendering strategies when synthesizing loudspeaker array room impulse responses (RIRs) using microphone array RIRs in a parametric fashion. The aim of this rendering task is to faithfully reproduce the spatial characteristics of a captured space, encoded within the input...
Article
Full-text available
Modern spatial audio reproduction techniques with headphones or loudspeakers seek to control the perceived spatial image as accurately as possible in three dimensions. The mechanisms of spatial perception have been studied mainly in the horizontal plane, and this article attempts to shed some light on the corresponding phenomena in the median plane...
Article
Full-text available
The purpose of this article is to detail and evaluate three alternative approaches to soundfield visualization, which all employ the use of spatially localized active-intensity (SLAI) vectors. These SLAI vectors are of particular interest, as they allow direction-of-arrival (DoA) estimates to be extracted in multiple spatially localized sectors, su...
Conference Paper
This contribution proposes a simplified rendering of source directivity patterns for the simulation and auralization of auditory scenes consisting of multiple listeners or sources. It is based on applying directivity filters of arbitrary directivity patterns at multiple, supposedly important directions, and approximating the filter outputs of inter...
Conference Paper
Full-text available
This work presents a machine-learning-based method to estimate the reverberation time of a virtual room for auralization purposes. The models take as input geometric features of the room and output the estimated reverberation time values as function of frequency. The proposed model is trained and evaluated using a novel dataset composed of real-wor...
Conference Paper
Full-text available
This article pertains to parametric rendering of microphone array impulse responses, such that the spatial characteristics of a captured space may be imposed onto a monophonic input signal and reproduced over an array of loudspeakers. Parametric methods operate by analysing a set of spatial parameters, dividing the response into components based on...
Conference Paper
Full-text available
Auditory localization under conflicting dynamic and spectral cues was investigated in a listening experiment where head-motion-coupled amplitude panning was used to create front-back confusions with moving free-field stimuli. Subjects reported whether stimuli of various spectra formed auditory images in the front, rear or both hemiplanes simultaneo...
Conference Paper
Full-text available
A powerful and flexible approach to record or encode a spatial sound scene is through spherical harmonics (SHs), or Ambisonics. An SH-encoded scene can be rendered binaurally by applying SH-encoded head-related transfer functions (HRTFs). Limitations of the recording equipment or computational constraints dictate the spatial reproduction accurac...
Conference Paper
Full-text available
A method for computing and sharpening angular spectra, derived from low-order ambisonic signals, is presented in this paper, which is intended for high-resolution directional sound-field visualisation. The method relies on a re-assignment principle, whereby the directional energy for each grid point is assigned to a new direction, which corresponds...
Article
In this work, a technique to render the acoustic effect of scattering from finite objects in virtual reality is proposed, which aims to provide a perceptually plausible response for the listener, rather than a physically accurate response. The effect is implemented using parametric filter structures and the parameters for the filters are estimated...
Book
Sensory Evaluation of Sound provides a detailed review of the latest sensory evaluation techniques, specifically applied to the evaluation of sound and audio. This three-part book commences with an introduction to the fundamental role of sound and hearing, which is followed by an overview of sensory evaluation methods and associated univariate and...
Conference Paper
The inner hair cells of the mammalian cochlea transform the vibrations of their stereocilia into releases of neurotransmitter at the ribbon synapses, thereby controlling the activity of the afferent auditory fibers. The mechanical-to-neural transduction is a highly nonlinear process and it introduces differences between the frequency-tuning of the...
Conference Paper
Full-text available
Higher-order Ambisonics (HOA) is a flexible recording and reproduction method, which makes it attractive for several applications in virtual and augmented reality. However, the recording of HOA signals with practical compact microphone arrays is limited to a certain frequency range, which depends on the applied microphone array. In this paper, w...