Anton Batliner

Friedrich-Alexander-University of Erlangen-Nürnberg | FAU · Department of Computer Science, Pattern Recognition Lab

Dr.

About

446 Publications
75,823 Reads
10,472 Citations
Additional affiliations
February 2012 - October 2012
Technische Universität München
Position
  • Researcher
January 1997 - December 2012
Friedrich-Alexander-University of Erlangen-Nürnberg
January 1996 - present

Publications (446)
Preprint
Full-text available
Chronic obstructive pulmonary disease (COPD) causes lung inflammation and airflow blockage leading to a variety of respiratory symptoms; it is also a leading cause of death and affects millions of individuals around the world. Patients often require treatment and hospitalisation, while no cure is currently available. As COPD predominantly affects t...
Preprint
Full-text available
The ACII Affective Vocal Bursts Workshop & Competition is focused on understanding multiple affective dimensions of vocal bursts: laughs, gasps, cries, screams, and many other non-linguistic vocalizations central to the expression of emotion and to human communication more generally. This year's competition comprises four tracks using a large-scale...
Preprint
Full-text available
The ACM Multimedia 2022 Computational Paralinguistics Challenge addresses four different problems for the first time in a research competition under well-defined conditions: In the Vocalisations and Stuttering Sub-Challenges, a classification on human non-verbal vocalisations and speech has to be made; the Activity Sub-Challenge aims at beyond-audi...
Article
Full-text available
The article ‘The perception of emotional cues by children in artificial background noise’, written by Emilia Parada-Cabaleiro, Anton Batliner, Alice Baird and Björn Schuller, was originally published Online First without Open Access. After publication in volume 23, issue 1, pages 169–182, it has been decided to make the article an Open Access publicat...
Preprint
Full-text available
The COVID-19 pandemic has caused massive humanitarian and economic damage. Teams of scientists from a broad range of disciplines have searched for methods to help governments and communities combat the disease. One avenue from the machine learning field which has been explored is the prospect of a digital mass test which can detect COVID-19 from in...
Article
Full-text available
Musical listening is broadly used as an inexpensive and safe method to reduce self-perceived anxiety. This strategy is based on the emotivist assumption claiming that emotions are not only recognised in music but induced by it. Yet, the acoustic properties of musical work capable of reducing anxiety are still under-researched. To fill this gap, we...
Preprint
Full-text available
As one of the most prevalent neurodegenerative disorders, Parkinson's disease (PD) has a significant impact on the fine motor skills of patients. The complex interplay of different articulators during speech production and realization of required muscle tension become increasingly difficult, thus leading to a dysarthric speech. Characteristic patte...
Article
As one of the most prevalent neurodegenerative disorders, Parkinson’s disease (PD) has a significant impact on the fine motor skills of patients. The complex interplay of different articulators during speech production and realization of required muscle tension become increasingly difficult, thus leading to a dysarthric speech. Characteristic patte...
Conference Paper
Full-text available
Renaissance music constitutes a resource of immense richness for Western culture, as shown by its central role in digital humanities. Yet, despite the advance of computational musicology in analysing other Western repertoires, the use of computer-based methods to automatically retrieve relevant information from Renaissance music, e. g., identifying...
Article
The sudden outbreak of COVID-19 has resulted in tough challenges for the field of biometrics due to its spread via physical contact, and the regulations of wearing face masks. Given these constraints, voice biometrics can offer a suitable contact-less biometric solution; they can benefit from models that classify whether a speaker is wearing a mask...
Article
The Coronavirus (COVID-19) pandemic impelled several research efforts, from collecting COVID-19 patients’ data to screening them for virus detection. Some COVID-19 symptoms are related to the functioning of the respiratory system that influences speech production; this suggests research on identifying markers of COVID-19 in speech and other human g...
Article
COVID-19 is a global health crisis that has been affecting our daily lives throughout the past year. The symptomatology of COVID-19 is heterogeneous with a severity continuum. Many symptoms are related to pathological changes in the vocal system, leading to the assumption that COVID-19 may also affect voice production. For the first time, the prese...
Preprint
Full-text available
The INTERSPEECH 2021 Computational Paralinguistics Challenge addresses four different problems for the first time in a research competition under well-defined conditions: In the COVID-19 Cough and COVID-19 Speech Sub-Challenges, a binary classification on COVID-19 infection has to be made based on coughing sounds and speech; in the Escalation SubCh...
Preprint
Full-text available
COVID-19 is a global health crisis that has been affecting many aspects of our daily lives throughout the past year. The symptomatology of COVID-19 is heterogeneous with a severity continuum. A considerable proportion of symptoms are related to pathological changes in the vocal system, leading to the assumption that COVID-19 may also affect voice p...
Article
Full-text available
Extensive research has been published on the effects of music in reducing anxiety. Yet, for most of the existing works, a common methodology regarding musical genres and measurement techniques is missing, which limits considerably the comparison between them. In this study, we assess, for the first time, markedly different musical genres with both p...
Conference Paper
Full-text available
Modelling of the breath signal is of high interest to both healthcare professionals and computer scientists, as a source of diagnosis-related information, or a means for curating higher quality datasets in speech analysis research. The formation of a breath signal gold standard is, however, not a straightforward task, as it requires specialised equ...
Conference Paper
Full-text available
The INTERSPEECH 2020 Computational Paralinguistics Challenge addresses three different problems for the first time in a research competition under well-defined conditions: In the Elderly Emotion Sub-Challenge, arousal and valence in the speech of elderly individuals have to be modelled as a 3-class problem; in the Breathing Sub-Challenge, breathing...
Article
Full-text available
With the advent of ‘heavy Artificial Intelligence’ (big data, deep learning, and ubiquitous use of the internet), ethical considerations are widely dealt with in public discussions and governmental bodies. Within Computational Paralinguistics with its manifold topics and possible applications (modelling of long-term, medium-term, and short-term tra...
Article
Full-text available
Most typically developed individuals have the ability to perceive emotions encoded in speech; yet, factors such as age or environmental conditions can restrict this inherent skill. Noise pollution and multimedia over-stimulation are common components of contemporary society, and have shown to particularly impair a child’s interpersonal skills. Asse...
Conference Paper
Full-text available
Early musical sources in white mensural notation, the most common notation in European printed music during the Renaissance, are nowadays preserved by libraries worldwide through digitalisation. Still, the application of music information retrieval to this repertoire is restricted by the use of digitalisation techniques which produce an uncodified out...
Preprint
Full-text available
In this article, we study laughter found in child-robot interaction where it had not been prompted intentionally. Different types of laughter and speech-laugh are annotated and processed. In a descriptive part, we report on the position of laughter and speech-laugh in syntax and dialogue structure, and on communicative functions. In a second part,...
Article
Full-text available
We present DEMoS (Database of Elicited Mood in Speech), a new, large database with Italian emotional speech: 68 speakers, some 9 k speech samples. As Italian is under-represented in speech emotion research, for a comparison with the state-of-the-art, we model the ‘big 6 emotions’ and guilt. Besides making available this database for research, our c...
Conference Paper
Full-text available
The expression of emotion is an inherent aspect in singing, especially in operatic voice. Yet, adverse acoustic conditions, e. g., a performance in open-air or a noisy analog recording, may affect its perception. State-of-the-art methods for emotional speech evaluation have been applied to operatic voice, such as perception experiments, acous...
Conference Paper
Full-text available
The Italian madrigal, a polyphonic secular a cappella composition of the 16th century, is characterised by a strong musical-linguistic relationship, which has made it an icon of the 'Renaissance humanism'. In madrigals, lyrical meaning is mimicked by the music, through the utilisation of a composition technique known as madrigalism. The synergy b...
Conference Paper
Full-text available
The INTERSPEECH 2018 Computational Paralinguistics Challenge addresses four different problems for the first time in a research competition under well-defined conditions: In the Atypical Affect Sub-Challenge, four basic emotions annotated in the speech of handicapped subjects have to be classified; in the Self-Assessed Affect Sub-Challenge, valence s...
Article
In this article, we review the INTERSPEECH 2013 Computational Paralinguistics ChallengE (ComParE) – the first of its kind – in light of the recent developments in affective and behavioural computing. The impact of the first ComParE instalment is manifold: first, it featured various new recognition tasks including social signals such as laughter and...
Conference Paper
Full-text available
With the increased usage of internet based services and the mass of digital content now available online, the organisation of such content has become a major topic of interest both commercially and within academic research. The addition of emotional understanding for the content is a relevant parameter not only for music classification within digit...
Conference Paper
Full-text available
The outputs of the higher layers of deep pre-trained convolutional neural networks (CNNs) have consistently been shown to provide a rich representation of an image for use in recognition tasks. This study explores the suitability of such an approach for speech-based emotion recognition tasks. First, we detail a new acoustic feature representation,...
Conference Paper
Full-text available
The automatic analysis of notated Renaissance music is restricted by a shortfall in codified repertoire. Thousands of scores have been digitised by music libraries across the world, but the absence of symbolically codified information makes these inaccessible for computational evaluation. Optical Music Recognition (OMR) made great progress in addre...
Conference Paper
In this work, we present an in-depth analysis of the interdependency between the non-native prosody and the native language (L1) of English L2 speakers, as separately investigated in the Degree of Nativeness Task and the Native Language Task of the INTERSPEECH 2015 and 2016 Computational Paralinguistics ChallengE (ComParE). To this end, we propose...
Article
Full-text available
We propose a new recognition task in the area of computational paralinguistics: automatic recognition of eating conditions in speech, i. e., whether people are eating while speaking, and what they are eating. To this end, we introduce the audio-visual iHEARu-EAT database featuring 1.6 k utterances of 30 subjects (mean age: 26.1 years, standard devi...
Conference Paper
Full-text available
We address ethical considerations concerning iHEARu-PLAY, a web-based, crowdsourced, multiplayer game for large-scale, real-life corpus collection and multi-label, holistic data annotation for advanced paralinguistic tasks. While playing the game, users are recorded or perform labelling tasks, compete with other players, and are rewarded with score...
Book
Full-text available
In 2015, the Workshop on Speech and Language Technology in Education (SLaTE) took place in Leipzig, Germany, as a Satellite Workshop of Interspeech 2015 in Dresden. This workshop was the meeting of the correspondent ISCA Special Interest Group, organized by the Pattern Recognition Lab of Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU) in co...
Conference Paper
Full-text available
The INTERSPEECH 2014 Computational Paralinguistics Challenge provides for the first time a unified test-bed for the automatic recognition of speakers' cognitive and physical load in speech. In this paper, we describe these two Sub-Challenges, their conditions, baseline results and experimental procedures, as well as the COMPARE baseline features ge...
Article
The INTERSPEECH 2012 Speaker Trait Challenge aimed at a unified test-bed for perceived speaker traits – the first challenge of this kind: personality in the five OCEAN personality dimensions, likability of speakers, and intelligibility of pathologic speakers. In the present article, we give a brief overview of the state-of-the-art in these three fi...
Conference Paper
Full-text available
The degree of sleepiness in the Sleepy Language Corpus from the Interspeech 2011 Speaker State Challenge is predicted with regression and a very large feature vector. Most notable is the great gender difference which can mainly be attributed to females showing their sleepiness less than males do.