
Alois Sontacchi
Kunstuniversität Graz | KUG · Institute of Electronic Music and Acoustics
Doctor of Engineering
About
120 Publications
37,102 Reads
792 Citations
Publications (120)
Singing voice directivity for five sustained German vowels /a:/, /e:/, /i:/, /o:/, /u:/ over a wide pitch range was investigated using a multichannel microphone array with high spatial resolution along the horizontal and vertical axes. A newly created dataset makes it possible to examine voice directivity in classical singing with high resolution in angle and...
Voice directivity has an influence on the perceived acoustics for both the singer/speaker and the audience. One of the most important aspects of voice directivity in a room is the direct-to-reverberant energy ratio (D/R ratio) at the listening position. The more focused the voice directivity is, the higher the D/R ratio. This is why voice directivi...
Voice disorders due to strenuous usage of unhealthy voice qualities are a common problem in professional singing. To minimize the risk of these voice disorders, vital feedback can be given by making singers aware of their sung voice quality. This work presents the design of a vowel and voice quality indication tool which can enable such a fee...
This paper focuses on the analysis and evaluation of acoustical design criteria to produce a plausible 3D sound field solely via headrests with integrated loudspeakers at the driver/passenger seats in the car cabin. Existing audio systems in cars utilize several distributed loudspeakers to supply passengers with sound. Such configurations suffer fr...
Directivity of speech and singing is determined primarily by the morphology of a person, i.e., head size, torso dimensions, posture, and vocal tract. Previous works have suggested from measurements that voice directivity in singing is controlled unintentionally by spectral emphasis in the range of 2–4 kHz. An attempt is made to identify to wha...
fnma Magazin 02/2020 - E-Assessment und E-Examinations
https://www.fnma.at/content/download/2087/magazine_download/2020-02.pdf
The constant-Q transform (CQT) is a valuable tool for music information retrieval, e.g. for chroma calculation and harmonic analysis. In this E-Brief, we propose a block based, real-time capable, efficient analysis algorithm resting upon a subsampling technique performed with fast Fourier transform. In addition, advanced features such as time resol...
Directivity of speech and singing is primarily determined by the physiology of a person and therefore by the head size, torso dimensions and posture. Previous works have concluded that in singing only the intentional spectral emphasis (due to the singer’s formant) defines the directionality of singing in a room-acoustical sense. Nevertheless, our work ha...
Sources from the frontal direction are still particularly challenging in binaural reproduction, as there are virtually no interaural time and level differences. The perceived image of a binaural reproduction typically suffers from vertical mislocalization and in-head localization. In the literature, different reasons for this problem can be foun...
Headphone with additional tiny loudspeakers to facilitate individualized binaural reproduction
Ambisonics is a production format for 3D audio that is based on the representation of the sound field excitation as decomposition into orthonormal basis functions, the so-called spherical harmonics. This representation allows for a production process that is independent of the target playback system, be it loudspeakers or headphones. The concert ni...
Loopers are becoming increasingly popular due to their growing features and capabilities, not only in live performances but also as a rehearsal tool. These effect units record a phrase and play it back in a loop. The start and stop positions of the recording are typically the player's start and stop taps on a foot switch. However, if these cues are not...
Today, the number of downsized engines with two or three cylinders is increasing due to an increase in fuel efficiency. However, downsized engines exhibit unbalanced interior sound in the range of their optimal engine speed, largely because of their dominant engine orders. In particular, the sound of two-cylinder engines yields half the perceived e...
This contribution presents active sound generation (ASG) for interior sound enhancement to assist or improve sound feedback in either downsized combustion engines or electric engines. It reports evaluation results from two studies about the description of engine sounds and the influence of sound feedback on driving behavior. Preliminary results in...
Ambisonics is a 3D recording and playback method that is based on the representation of the sound field excitation as a decomposition into spherical harmonics. This representation facilitates spatial sound production that is independent of the playback system. The adaptation to a given playback system (loudspeakers or motion-tracked headphones) is...
PEAQ (Perceptual Evaluation of Audio Quality) is an international standard for quality prediction of wide-band audio codecs (coder-decoder) according to ITU-R BS.1387, developed by an international consortium of leading audio quality experts in 1999. The commercially available implementation of PEAQ offers two analysis models (basic and advanced) w...
Due to future directives of the European Union regarding fuel consumption and CO2 emissions, the automotive industry is forced to develop new and unconventional technologies. These include, for example, stop-start systems, cylinder deactivation, or even a reduction in the number of cylinders, which, however, lead to unusual acoustical perceptions and custom...
Today, the number of downsized engines with two or three cylinders is increasing due to an increase in fuel efficiency. However, downsized engines exhibit unbalanced interior sound in the range of their optimal engine speed, largely because of their dominant engine orders. In particular, the sound of two-cylinder engines yields half the perceived e...
When employing in-car active sound generation (ASG) and active noise cancellation (ANC), the accurate knowledge of the vehicle interior sound pressure distribution in magnitude as well as phase is paramount. Revisiting the ANC concept, relevant boundary conditions in spatial sound fields will be addressed. Moreover, within this study the controllab...
Distributed microphone arrays exploit the spatial diversity of an acoustic scene and obtain higher signal-to-noise ratios than compact microphone arrays that sample the sound field only locally. However, as distances between distributed microphones grow, wired connections become infeasible and Wireless Acoustic Sensor Networks (WASN) need to be emp...
Directional detection of sound sources under defined ambience conditions using a spherical microphone array (Eigenmike) is examined. The spatial detection algorithm used correlates synthesized spherical wave spectra derived from theory with a set of concrete spherical spectra calculated from measured impulse responses. Thus, measurement signals wer...
Pitch shifting of polyphonic music is usually performed by manipulating the time-frequency representation of the input signal. Most approaches proposed in the past are based on the Fourier transform although its linear frequency bin spacing is known to be inadequate to some degree for analyzing and processing music signals. Recently invertible cons...
Due to the ongoing trend toward downsizing and stricter emission regulations, the need to verify sound quality in engine and vehicle development is increasing. To enable objective assessments of engine sound quality, in recent years psychoacoustic parameters such as CKI (Combustion Knocking I...
This article presents a new database of speech produced under cognitive load for the purpose of non-invasive psychological stress monitoring. The voices and the heart rates of eight airline pilots were recorded while completing an advanced flight simulation programme in a level D full flight simulator. Focusing on real-world applicability, the expe...
Pitch-scale modifications of polyphonic music are usually performed by manipulating the time-frequency representation of the input signal. Most approaches proposed in the past are thereby based on the Fourier transform although its linear frequency bin spacing is known to be inadequate to some degree for analysing and processing music signals. Rece...
The noise reduction of active noise cancellation (ANC) headphones is usually assessed with measurements on different ear simulators. This assessment, however, is difficult because the ANC depends on the tightness of the wearing situation. Different ear simulators provoke different leakage situations and therefore lead to different ANC results. We co...
The presented research project “Acoustic Interface for tremor Analysis” aims at the development of methods for real-time acoustical tremor diagnosis. Based on the analysis and sonification of three-dimensional acceleration data of hand movements of tremor patients, differences among tremor types are made audible and clearly recognizable. The sonific...
In the past, the exterior and interior noise levels of vehicles have been largely reduced to comply with stricter legislation and to meet customer demand. As a consequence, noise quality, rather than noise level, inside the vehicle plays a crucial role. For an economic development of new powertrains it is important to assess noise qualit...
A computationally efficient 3D real-time rendering engine for binaural sound reproduction via headphones is presented. Binaural sound reproduction requires filtering the virtual sound source signals with head-related transfer functions (HRTFs). To improve human localization capabilities, head tracking as well as room simulation have to be incorpora...
This paper presents an intuitive pointing method for measuring the perceived direction in 3D localization experiments. The method uses a motion-tracked toy gun as a pointing device and can be used from all positions in any nearly convex surrounding hull or loudspeaker setup, as the pointed direction is computed from the piercing point of the gun's di...
This article presents a subjective evaluation of a proprietary sub-band ADPCM (Adaptive Differential Pulse Code Modulation) codec for digital wireless transmission. The evaluation is carried out with 40 expert listeners and is divided into several experimental stages: First, the audibility threshold for codec artifacts is determined for each freque...
Ambisonics is a 3D audio surround rendering and representation approach based on spherical harmonics with loudspeaker-independent transmission channels. Although it was developed in the seventies and the techniques are well known, there are disagreements about how to normalize, store and exchange Ambisonic data. This paper's mission is to propose a stan...
Most genre classification systems are based on feature vectors which are either computed from the whole audio file or short arbitrary excerpts. However, structural information related to the musical form of songs has not been considered so far. To account for this musically relevant information, we propose to perform an additional segment detection...
We present a novel method to adjust the perceived width of a phantom source by varying the deterministic inter-channel time difference (ICTD) in a pair of signals over frequency. In contrast to given literature that focuses on random phase over frequency, our paper considers a deterministic approach that is open to a more systematic evaluation....
This paper presents a robust, accurate sound source localization method using a compact, near-coincident microphone array. We derive features by combining the microphone signals and determine the direction of a single sound source by similarity matching. Therefore, the observed features are compared with a set of previously measured reference fea...
In this article, a model that predicts the transparency of mixdowns is proposed. The Masked-to-Unmasked-Ratio relates the original loudness of an instrument to its loudness in the mix. In order to assess this new measure a listening test is conducted. It is shown that instruments with a Masked-to-Unmasked-Ratio of 10 % or smaller are critical in m...
Air traffic controllers listen to pilots’ radio communications either by headphones or by loudspeakers. As air traffic increases, there is a tendency to use headphones to reduce the ambient noise level in the control room. Headphones are less disturbing for neighbouring controllers but may be uncomfortable to wear after long periods. This paper inv...
Ambisonics can be regarded as a holophonic sound field rendering technique that decodes spherical harmonic encoded source-signals to discrete loudspeakers arranged on a sphere. The aim is the re-synthesis of sound sources perceivable from certain spatial directions, either by reproducing dedicated Ambisonics microphone recordings or synthetic signa...
Human speech is a promising signal source for workload monitoring purposes due to (a) its sensitivity to a variety of aspects of workload and (b) the facility of non-intrusive signal capturing. Many approaches in this field of research have been presented in recent years, but none has led to a working implementation in civil ATC. In this pap...
The complex transfer characteristics of a vehicle body are determined, among other methods, by means of transfer path analysis (TPA). The quality of the results depends strongly on the measurement acquisition and the mathematical modeling. AVL List GmbH developed promising new approaches within a...
The complex NVH transfer properties of a vehicle body are evaluated by transfer path analysis (TPA). The quality of the results depends mainly on the measurement technology and the applied mathematical models. AVL List GmbH developed a promising new approach during a research project and presents the simulation tool TPA-Form, which allows a...
Over the past few years there has been growing awareness of the need for an agreed format for ambisonic files and for the interchange of other ambisonic signal sets. Here we propose a standard that is both simple and intended to be future proof. The proposal is the outcome of many months of discussion, on the Web and by email, and of physical meeti...
Today's electroacoustic concert halls are usually equipped with a multi-speaker setup. Some of these environments can play music in a spatialised 3D space including virtual acoustics. For remote performances, a concert in a source place is transmitted to a remote place where the audience is located. The concert's "audio signatu...
This paper discusses common approaches that have the potential to establish a controllable sound field within a restricted area using loudspeaker setups. The use of headphones, which is demanding over long periods, can thus be avoided. In the case of air traffic control at controller working positions this inventi...
This study presents the results from localization experiments of virtual sound sources using a 12-channel, nearly circular 2D Ambisonics system. The perceived direction of the sound and a subjective rating of the localization accuracy have been assigned to each virtual source. As playback methods, Ambisonics decoders with different order and spatia...
The approach to realise periphonic sound field reproduction based on spherical harmonics (multi-pole theory) is already well known as Ambisonics and Higher Order Ambisonics, respectively. With the aid of an N-dimensional orthogonal set of vectors, any arbitrary source-free sound field can be described. Reproduction is realized by projection of t...
Head-related transfer functions (HRTFs) describe the physical path from an acoustical source to the ears. They can be obtained by relating two measurements: the first gives the reference sound pressure at the virtual center of the head, and the second is taken at both ears. In the literature, exhaustive investigations concerning the idealize...
The implementation of a prototype to establish a controllable sound field within a restricted area utilizing distributed loudspeakers is presented. Based on the near field beam-forming approach a demonstrator setup has been developed and implemented. The proposed solution should be a primary step towards providing a convincing alternative instead o...
The invention relates to a method for calculating direction-corrected transfer functions (FRFx, FRFy, FRFz) and/or direction-corrected impedance quantities in a transfer path analysis of a vibrating structure, wherein at least one force (F) is introduced at at least one coupling point and at least one response signal to the introduced...
This paper focuses on various application scenarios based on the wave field synthesis (WFS) approach which have been implemented and/or investigated in our laboratories lately. Within the few different selected scenarios, we try to show the possibility to combine different state-of-the-art audio rendering approaches to obtain an efficient solution...
One part of our project "Virtual Gamelan Graz" (VGG) deals with the analysis and re-synthesis of acoustic radiation considering selected Gamelan instruments. Spherical loudspeaker arrays seem to be particularly appropriate for the re-synthesis task. This kind of sound source consists of a solid spherical body, into which individual, separate...