Irene Martin-Morato

Irene Martin-Morato
Tampere University | UTA · computing science unit

PhD

About

19
Publications
794
Reads
How we measure 'reads'
A 'read' is counted each time someone views a publication summary (such as the title, abstract, and list of authors), clicks on a figure, or views or downloads the full-text. Learn more
154
Citations

Publications

Publications (19)
Preprint
Full-text available
The Detection and Classification of Acoustic Scenes and Events Challenge Task 4 aims to advance sound event detection (SED) systems in domestic environments by leveraging training data with different supervision uncertainty. Participants are challenged in exploring how to best use training data from different domains and with varying annotation gra...
Preprint
This article describes the Data-Efficient Low-Complexity Acoustic Scene Classification Task in the DCASE 2024 Challenge and the corresponding baseline system. The task setup is a continuation of previous editions (2022 and 2023), which focused on recording device mismatches and low-complexity constraints. This year's edition introduces an additiona...
Preprint
In this paper, we study the use of soft labels to train a system for sound event detection (SED). Soft labels can result from annotations which account for human uncertainty about categories, or emerge as a natural representation of multiple opinions in annotation. Converting annotations to hard labels results in unambiguous categories for training...
Article
Full-text available
Crowdsourcing is a popular tool for collecting large amounts of annotated data, but the specific format of the strong labels necessary for sound event detection is not easily obtainable through crowdsourcing. In this work, we propose a novel annotation workflow that leverages the efficiency of crowdsourcing weak labels, and uses a high number of an...
Preprint
Full-text available
This paper analyzes the outcome of the Low-Complexity Acoustic Scene Classification task in DCASE 2022 Challenge. The task is a continuation from the previous years. In this edition, the requirement for low-complexity solutions were modified including: a limit of 128 K on the number of parameters, including the zero-valued ones, imposed INT8 numeri...
Preprint
Strong labels are a necessity for evaluation of sound event detection methods, but often scarcely available due to the high resources required by the annotation task. We present a method for estimating strong labels using crowdsourced weak labels, through a process that divides the annotation task into simple unit tasks. Based on estimations of ann...
Preprint
This paper presents the details of Task 1A Acoustic Scene Classification in the DCASE 2021 Challenge. The task consisted of classification of data from multiple devices, requiring good generalization properties, using low-complexity solutions. The provided baseline system is based on a CNN architecture and post-training parameters quantization. The...
Preprint
Full-text available
Crowdsourcing has become a common approach for annotating large amounts of data. It has the advantage of harnessing a large workforce to produce large amounts of data in a short time, but comes with the disadvantage of employing non-expert annotators with different backgrounds. This raises the problem of data reliability, in addition to the general...
Article
In the last years, deep convolutional neural networks have become a standard for the development of state-of-the-art audio classification systems, taking the lead over traditional approaches based on feature engineering. While they are capable of achieving human performance under certain scenarios, it has been shown that their accuracy is severely...
Article
Full-text available
Residual learning is known for being a learning framework that facilitates the training of very deep neural networks. Residual blocks or units are made up of a set of stacked layers, where the inputs are added back to their outputs with the aim of creating identity mappings. In practice, such identity mappings are accomplished by means of the so-ca...
Preprint
Residual learning is a recently proposed learning framework to facilitate the training of very deep neural networks. Residual blocks or units are made of a set of stacked layers, where the inputs are added back to their outputs with the aim of creating identity mappings. In practice, such identity mappings are accomplished by means of the so-called...
Article
Low-level audio features are commonly used in many audio analysis tasks such as audio scene classification or acoustic event detection. Due to the variable length of audio signals, it is a common approach to create fixed-length feature vectors consisting of a set of statistics that summarize the temporal variability of such short-term features. To...
Conference Paper
Automatic recognition of multiple acoustic events is an interesting problem in machine listening that generalizes the classical speech/non-speech or speech/music classification problem. Typical audio streams contain a diversity of sound events that carry important and useful information on the acoustic environment and context. Classification is usu...

Network

Cited By