Anton Batliner · Technical University of Munich | TUM · School of Medicine and Health
Dr.
About
475
Publications
104,641
Reads
13,732
Citations
Introduction
Almost all my papers can be downloaded from:
https://opus.bibliothek.uni-augsburg.de/opus4/solrsearch/index/search/searchtype/authorsearch/author/Anton+Batliner/start/0/rows/20/sortfield/year/sortorder/asc
See also Google Scholar:
http://scholar.google.com/citations?hl=en&user=-6MBiKgAAAAJ&view_op=list_works
Additional affiliations
February 2012 - October 2012
January 1996 - present
January 1997 - December 2012
Publications (475)
Chronic obstructive pulmonary disease (COPD) is a serious inflammatory lung disease affecting millions of people around the world. Due to an obstructed airflow from the lungs, it also becomes manifest in patients' vocal behaviour. Of particular importance is the detection of an exacerbation episode, which marks an acute phase and often requires hos...
We revisit the INTERSPEECH 2009 Emotion Challenge -- the first ever speech emotion recognition (SER) challenge -- and evaluate a series of deep learning models that are representative of the major advances in SER research in the time since then. We start by training each model using a fixed set of hyperparameters, and further fine-tune the best-per...
The relationship between music and emotion has been addressed within several disciplines, from more historico-philosophical and anthropological ones, such as musicology and ethnomusicology, to others that are traditionally more empirical and technological, such as psychology and computer science. Yet, understanding the link between music and emotio...
Emotion is an important component of music investigated in music psychology. In recent years, the use of computational methods to assess the link between music and emotions has been promoted by advances in music emotion recognition. However, one of the main limitations of applying data-driven approaches to understand such a link is the scarce knowl...
Charisma is considered to be one's ability to attract and potentially influence others. Clearly, there is considerable interest, from an artificial intelligence (AI) perspective, in providing AI with such a skill. Beyond that, a plethora of use cases opens up for computational measurement of human charisma, such as for tutoring humans in the acquisition of...
Recent years have seen a rapid increase in digital medicine research in an attempt to transform traditional healthcare systems to their modern, intelligent, and versatile equivalents that are adequately equipped to tackle contemporary challenges. This has led to a wave of applications that utilise AI technologies; first and foremost in the fields o...
The ACM Multimedia 2023 Computational Paralinguistics Challenge addresses two different problems for the first time in a research competition under well-defined conditions: In the Emotion Share Sub-Challenge, a regression on speech has to be made; and in the Requests Sub-Challenges, requests and complaints need to be detected. We describe the Sub-C...
The COVID-19 pandemic has caused massive humanitarian and economic damage. Teams of scientists from a broad range of disciplines have searched for methods to help governments and communities combat the disease. One avenue from the machine learning field which has been explored is the prospect of a digital mass test which can detect COVID-19 from in...
This article contributes to a more adequate modelling of emotions encoded in speech, by addressing four fallacies prevalent in traditional affective computing: First, studies concentrate on few emotions and disregard all other ones (‘closed world’). Second, studies use clean (lab) data or real-life ones but do not compare clean and noisy data in a...
Purpose
The aim of this study was to investigate the speech prosody of postlingually deaf cochlear implant (CI) users compared with control speakers without hearing or speech impairment.
Method
Speech recordings of 74 CI users (37 males and 37 females) and 72 age-balanced control speakers (36 males and 36 females) are considered. All participants...
Since the end of the last century, the automatic processing of paralinguistics has been investigated widely and put into practice in many applications, on wearables, smartphones, and computers. In this contribution, we address ethical awareness for paralinguistic applications, by establishing taxonomies for data representations, system designs for...
This is the Proceedings of the ACII Affective Vocal Bursts Workshop and Competition (A-VB). A-VB was a workshop-based challenge that introduces the problem of understanding emotional expression in vocal bursts -- a wide range of non-verbal vocalizations that includes laughs, grunts, gasps, and much more. With affective states informing both mental...
Chronic obstructive pulmonary disease (COPD) causes lung inflammation and airflow blockage leading to a variety of respiratory symptoms; it is also a leading cause of death and affects millions of individuals around the world. Patients often require treatment and hospitalisation, while no cure is currently available. As COPD predominantly affects t...
The ACII Affective Vocal Bursts Workshop & Competition is focused on understanding multiple affective dimensions of vocal bursts: laughs, gasps, cries, screams, and many other non-linguistic vocalizations central to the expression of emotion and to human communication more generally. This year's competition comprises four tracks using a large-scale...
The ACM Multimedia 2022 Computational Paralinguistics Challenge addresses four different problems for the first time in a research competition under well-defined conditions: In the Vocalisations and Stuttering Sub-Challenges, a classification on human non-verbal vocalisations and speech has to be made; the Activity Sub-Challenge aims at beyond-audi...
The article ‘The perception of emotional cues by children in artificial background noise’, written by Emilia Parada-Cabaleiro, Anton Batliner, Alice Baird and Björn Schuller, was originally published Online First without Open Access. After publication in volume 23, issue 1, pages 169–182, it has been decided to make the article an Open Access publicat...
Musical listening is broadly used as an inexpensive and safe method to reduce self-perceived anxiety. This strategy is based on the emotivist assumption claiming that emotions are not only recognised in music but induced by it. Yet, the acoustic properties of musical work capable of reducing anxiety are still under-researched. To fill this gap, we...
As one of the most prevalent neurodegenerative disorders, Parkinson's disease (PD) has a significant impact on the fine motor skills of patients. The complex interplay of different articulators during speech production and the realization of the required muscle tension become increasingly difficult, thus leading to dysarthric speech. Characteristic patte...
Renaissance music constitutes a resource of immense richness for Western culture, as shown by its central role in digital humanities. Yet, despite the advance of computational musicology in analysing other Western repertoires, the use of computer-based methods to automatically retrieve relevant information from Renaissance music, e. g., identifying...
The sudden outbreak of COVID-19 has resulted in tough challenges for the field of biometrics due to its spread via physical contact, and the regulations of wearing face masks. Given these constraints, voice biometrics can offer a suitable contact-less biometric solution; they can benefit from models that classify whether a speaker is wearing a mask...
The Coronavirus (COVID-19) pandemic impelled several research efforts, from collecting COVID-19 patients’ data to screening them for virus detection. Some COVID-19 symptoms are related to the functioning of the respiratory system that influences speech production; this suggests research on identifying markers of COVID-19 in speech and other human g...
COVID-19 is a global health crisis that has been affecting our daily lives throughout the past year. The symptomatology of COVID-19 is heterogeneous with a severity continuum. Many symptoms are related to pathological changes in the vocal system, leading to the assumption that COVID-19 may also affect voice production. For the first time, the prese...
The INTERSPEECH 2021 Computational Paralinguistics Challenge addresses four different problems for the first time in a research competition under well-defined conditions: In the COVID-19 Cough and COVID-19 Speech Sub-Challenges, a binary classification on COVID-19 infection has to be made based on coughing sounds and speech; in the Escalation SubCh...
COVID-19 is a global health crisis that has been affecting many aspects of our daily lives throughout the past year. The symptomatology of COVID-19 is heterogeneous with a severity continuum. A considerable proportion of symptoms are related to pathological changes in the vocal system, leading to the assumption that COVID-19 may also affect voice p...
Extensive research has been published on the effects of music in reducing anxiety. Yet, for most of the existing works, a common methodology regarding musical genres and measurement techniques is missing, which considerably limits the comparison between them. In this study, we assess, for the first time, markedly different musical genres with both p...
Modelling of the breath signal is of high interest to both healthcare professionals and computer scientists, as a source of diagnosis-related information, or a means for curating higher quality datasets in speech analysis research. The formation of a breath signal gold standard is, however, not a straightforward task, as it requires specialised equ...
The INTERSPEECH 2020 Computational Paralinguistics Challenge addresses three different problems for the first time in a research competition under well-defined conditions: In the Elderly Emotion Sub-Challenge, arousal and valence in the speech of elderly individuals have to be modelled as a 3-class problem; in the Breathing Sub-Challenge, breathing...
With the advent of ‘heavy Artificial Intelligence’ (big data, deep learning, and ubiquitous use of the internet), ethical considerations are widely dealt with in public discussions and governmental bodies. Within Computational Paralinguistics with its manifold topics and possible applications (modelling of long-term, medium-term, and short-term tra...
Most typically developed individuals have the ability to perceive emotions encoded in speech; yet, factors such as age or environmental conditions can restrict this inherent skill. Noise pollution and multimedia over-stimulation are common components of contemporary society, and have been shown to particularly impair a child’s interpersonal skills. Asse...
Early musical sources in white mensural notation, the most common notation in European printed music during the Renaissance, are nowadays preserved by libraries worldwide through digitalisation. Still, the application of music information retrieval to this repertoire is restricted by the use of digitalisation techniques which produce an uncodified out...
In this article, we study laughter found in child-robot interaction where it had not been prompted intentionally. Different types of laughter and speech-laugh are annotated and processed. In a descriptive part, we report on the position of laughter and speech-laugh in syntax and dialogue structure, and on communicative functions. In a second part,...
We present DEMoS (Database of Elicited Mood in Speech), a new, large database with Italian emotional speech: 68 speakers, some 9 k speech samples. As Italian is under-represented in speech emotion research, for a comparison with the state-of-the-art, we model the ‘big 6 emotions’ and guilt. Besides making available this database for research, our c...
The expression of emotion is an inherent aspect of singing, especially in operatic voice. Yet, adverse acoustic conditions, e.g., an open-air performance or a noisy analog recording, may affect its perception. State-of-the-art methods for emotional speech evaluation have been applied to operatic voice, such as perception experiments, acous...
The Italian madrigal, a polyphonic secular a cappella composition of the 16th century, is characterised by a strong musical-linguistic relationship, which has made it an icon of the 'Renaissance humanism'. In madrigals, lyrical meaning is mimicked by the music, through the utilisation of a composition technique known as madrigalism. The synergy b...
The INTERSPEECH 2018 Computational Paralinguistics Challenge addresses four different problems for the first time in a research competition under well-defined conditions: In the Atypical Affect Sub-Challenge, four basic emotions annotated in the speech of handicapped subjects have to be classified; in the Self-Assessed Affect Sub-Challenge, valence s...
In this article, we review the INTERSPEECH 2013 Computational Paralinguistics ChallengE (ComParE) – the first of its kind – in light of the recent developments in affective and behavioural computing. The impact of the first ComParE instalment is manifold: first, it featured various new recognition tasks including social signals such as laughter and...
With the increased usage of internet-based services and the mass of digital content now available online, the organisation of such content has become a major topic of interest both commercially and within academic research. The addition of emotional understanding for the content is a relevant parameter not only for music classification within digit...
The outputs of the higher layers of deep pre-trained convolutional neural networks (CNNs) have consistently been shown to provide a rich representation of an image for use in recognition tasks. This study explores the suitability of such an approach for speech-based emotion recognition tasks. First, we detail a new acoustic feature representation,...
The automatic analysis of notated Renaissance music is restricted by a shortfall in codified repertoire. Thousands of scores have been digitised by music libraries across the world, but the absence of symbolically codified information makes these inaccessible for computational evaluation. Optical Music Recognition (OMR) made great progress in addre...
In this work, we present an in-depth analysis of the interdependency between the non-native prosody and the native language (L1) of English L2 speakers, as separately investigated in the Degree of Nativeness Task and the Native Language Task of the INTERSPEECH 2015 and 2016 Computational Paralinguistics ChallengE (ComParE). To this end, we propose...
We propose a new recognition task in the area of computational paralinguistics: automatic recognition of eating conditions in speech, i. e., whether people are eating while speaking, and what they are eating. To this end, we introduce the audio-visual iHEARu-EAT database featuring 1.6 k utterances of 30 subjects (mean age: 26.1 years, standard devi...