Hanna Pamuła’s research while affiliated with AGH University of Krakow and other places

What is this page?


This page lists works of an author who doesn't have a ResearchGate profile or hasn't added the works to their profile yet. It is automatically generated from public (personal) data to further our legitimate goal of comprehensive and accurate scientific recordkeeping. If you are this author and want this page removed, please let us know.

Publications (12)


Fig. 3. Temporal profiles of each dataset. We show the empirical distributions (KDE-smoothed) of durations of marked regions, for the foreground (POS events) and negative regions (all non-POS regions) separately.
Fig. 4. Values of similarity between the annotated calls and the first 5 events (shots), and stereotypy for each class in the evaluation set. Classes are indicated on the horizontal axis by DatasetName_ClassName. Both factors are computed using a similarity metric based on the average maximum cross-correlation between events. It ranges between 0 and 1, where values closer to 1 represent higher similarity. (Details on how these values are computed are presented in Appendix A.2; a rough code sketch follows the caption list below.)
Fig. 5. F-Score results of 2022 and 2023 systems on each dataset of the 2022 evaluation set. Systems are ordered by overall highest scoring rank on the evaluation set.
Fig. 6. F-score (%) results by class in the evaluation set. Note that the QU and MS datasets only contain a single class and thus are not represented here. The systems are ordered by overall highest scoring rank on the evaluation set.
F-Score results for the different system variations on the evaluation and validation sets. The first row refers to the unchanged submitted system; all the other rows are simple modifications of it.
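Appendix A.2 is not reproduced on this page. As a rough illustration of the kind of metric the Fig. 4 caption describes, a similarity based on the average maximum cross-correlation between events might be sketched as follows; the function names and the normalisation are assumptions, not the authors' exact computation.

```python
import numpy as np

def max_xcorr(a: np.ndarray, b: np.ndarray) -> float:
    # Z-normalise both event waveforms, then take the peak of their
    # cross-correlation, scaled so identical events score 1.0.
    a = (a - a.mean()) / (a.std() + 1e-12)
    b = (b - b.mean()) / (b.std() + 1e-12)
    corr = np.correlate(a, b, mode="full") / np.sqrt(len(a) * len(b))
    return float(np.max(np.abs(corr)))

def average_max_xcorr(events: list) -> float:
    # Average the pairwise peak cross-correlations over all event pairs;
    # the result lies in [0, 1], with values near 1 meaning high stereotypy.
    scores = [max_xcorr(x, y)
              for i, x in enumerate(events)
              for y in events[i + 1:]]
    return float(np.mean(scores)) if scores else 1.0
```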
Learning to detect an animal sound from five examples
  • Article
  • Full-text available

August 2023 · 116 Reads · 43 Citations · Ecological Informatics
Shubhr Singh · [...] · Dan Stowell

Figure 2: F-Score results by team (best submission only). Systems are ordered from lowest to highest scoring rank on the evaluation set.
Few-shot bioacoustic event detection at the DCASE 2023 challenge

June 2023 · 109 Reads

Few-shot bioacoustic event detection consists of detecting sound events of specified types, in varying soundscapes, while having access to only a few examples of the class of interest. This task ran as part of the DCASE challenge for the third time this year, with an evaluation set expanded to include new animal species and a new rule: ensemble models were no longer allowed. The 2023 few-shot task received submissions from 6 different teams, with F-scores reaching as high as 63% on the evaluation set. Here we describe the task, focusing on the elements that differed from previous years. We also look back at past editions to describe how the task has evolved. Not only have the F-score results steadily improved (40% to 60% to 63%), but the systems proposed have also become more complex. Sound event detection systems are no longer simple variations of the provided baselines, and multiple few-shot learning methodologies remain strong contenders for the task.
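For context on how such F-scores are computed: sound event detection is typically scored by matching predicted events to reference events and counting matches as true positives. A minimal sketch, assuming greedy one-to-one matching with an intersection-over-union threshold (an illustrative choice, not the exact challenge scoring code):

```python
def event_fscore(pred, ref, min_iou=0.3):
    """Event-based F-score for lists of (onset, offset) pairs in seconds."""
    def iou(a, b):
        inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
        union = max(a[1], b[1]) - min(a[0], b[0])
        return inter / union if union > 0 else 0.0

    matched, tp = set(), 0
    for p in pred:                    # greedily match each prediction
        for j, r in enumerate(ref):   # to the first free reference event
            if j not in matched and iou(p, r) >= min_iou:
                matched.add(j)
                tp += 1
                break
    fp, fn = len(pred) - tp, len(ref) - tp
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    return 2 * prec * rec / (prec + rec) if prec + rec else 0.0
```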


Learning to detect an animal sound from five examples

May 2023 · 298 Reads

Automatic detection and classification of animal sounds have many applications in biodiversity monitoring and animal behaviour. In the past twenty years, the volume of digitised wildlife sound available has massively increased, and automatic classification through deep learning now shows strong results. However, bioacoustics is not a single task but a vast range of small-scale tasks (such as individual ID, call type, emotional indication) with wide variety in data characteristics, and most bioacoustic tasks do not come with strongly-labelled training data. The standard paradigm of supervised learning, focussed on a single large-scale dataset and/or a generic pre-trained algorithm, is insufficient. In this work we recast bioacoustic sound event detection within the AI framework of few-shot learning. We adapt this framework to sound event detection, such that a system can be given the annotated start/end times of as few as 5 events, and can then detect events in long-duration audio, even when the sound category was not known at the time of algorithm training. We introduce a collection of open datasets designed to strongly test a system's ability to perform few-shot sound event detection, and we present the results of a public contest to address the task. We show that prototypical networks are a strong-performing method when enhanced with adaptations for general characteristics of animal sounds. We demonstrate that widely-varying sound event durations are an important factor in performance, as is non-stationarity, i.e. gradual changes in conditions throughout the duration of a recording. For fine-grained bioacoustic recognition tasks without massive annotated training data, our results demonstrate that few-shot sound event detection is a powerful new method, strongly outperforming traditional signal-processing detection methods in the fully automated scenario.
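The prototypical-network approach mentioned above can be sketched in a few lines. This is a generic illustration, not the authors' system: the embedding network that produces the support and query embeddings is assumed and omitted.

```python
import torch
import torch.nn.functional as F

def prototypical_probs(support_emb, support_lbl, query_emb):
    # support_emb: (N, D) embeddings of the ~5 annotated example events
    # support_lbl: (N,) integer labels (e.g. 0 = background, 1 = target)
    # query_emb:  (M, D) embeddings of frames from the long recording
    classes = torch.unique(support_lbl)
    # One prototype per class: the mean embedding of its support examples.
    protos = torch.stack([support_emb[support_lbl == c].mean(dim=0)
                          for c in classes])
    # Soft assignment by negative squared Euclidean distance to prototypes.
    dists = torch.cdist(query_emb, protos) ** 2   # (M, C)
    return F.softmax(-dists, dim=-1)              # (M, C) class probabilities
```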


Figure 1: F-Score results by dataset. Systems are ordered by highest scoring rank on the evaluation set.
Few-shot bioacoustic sound event detection at the DCASE2022 challenge

July 2022 · 155 Reads

Few-shot sound event detection is the task of detecting sound events despite having only a few labelled examples of the class of interest. This framework is particularly useful in bioacoustics, where there is often a need to annotate very long recordings but expert annotator time is limited. This paper presents an overview of the second edition of the few-shot bioacoustic sound event detection task included in the DCASE 2022 challenge. A detailed description of the task objectives, dataset, and baselines is presented, together with the main results obtained and the characteristics of the submitted systems. This task received submissions from 15 different teams, of which 13 scored higher than the baselines. The highest F-score was 60% on the evaluation set, a large improvement over last year's edition. Highly-performing methods made use of prototypical networks and transductive learning, and addressed the variable length of events from all target classes. Furthermore, by analysing results on each of the subsets we can identify the main difficulties that the systems face, and conclude that few-shot bioacoustic sound event detection remains an open challenge.
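One practical detail behind such systems, handling events of widely varying length, shows up in post-processing, where per-frame probabilities are turned into discrete events. A minimal sketch of that step (the threshold and minimum duration are illustrative assumptions, not any submitted system's settings):

```python
import numpy as np

def probs_to_events(frame_probs, hop_s, threshold=0.5, min_dur_s=0.05):
    # frame_probs: 1-D array of P(target) per frame
    # hop_s:       hop between consecutive frames, in seconds
    active = frame_probs >= threshold
    events, start = [], None
    for i, a in enumerate(active):
        if a and start is None:
            start = i                      # event onset
        elif not a and start is not None:
            events.append((start * hop_s, i * hop_s))
            start = None
    if start is not None:                  # event still running at end of file
        events.append((start * hop_s, len(active) * hop_s))
    # Drop spurious detections shorter than the minimum duration.
    return [(s, e) for s, e in events if e - s >= min_dur_s]
```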


Figure 1: F-Score results by dataset. Systems are ordered by highest scoring rank on the evaluation set.
Few-shot bioacoustic event detection at the DCASE 2022 challenge

July 2022 · 62 Reads · 2 Citations

Few-shot sound event detection is the task of detecting sound events despite having only a few labelled examples of the class of interest. This framework is particularly useful in bioacoustics, where there is often a need to annotate very long recordings but expert annotator time is limited. This paper presents an overview of the second edition of the few-shot bioacoustic sound event detection task included in the DCASE 2022 challenge. A detailed description of the task objectives, dataset, and baselines is presented, together with the main results obtained and the characteristics of the submitted systems. This task received submissions from 15 different teams, of which 13 scored higher than the baselines. The highest F-score was 60% on the evaluation set, a large improvement over last year's edition. Highly-performing methods made use of prototypical networks and transductive learning, and addressed the variable length of events from all target classes. Furthermore, by analysing results on each of the subsets we can identify the main difficulties that the systems face, and conclude that few-shot bioacoustic sound event detection remains an open challenge.


Figure 11. Values of the d31 piezoelectric coefficient estimated from the FEM simulations (App is the peak-to-peak excitation amplitude): (a) for the composite beam containing an initial 50% PZT; (b) for the composite beam containing an initial 40% PZT.
Material parameters of the composite layers.
Piezoelectric Particulate Composite for Energy Harvesting from Mechanical Vibration

November 2020 · 159 Reads · 8 Citations

Energy harvesting from the mechanical vibration of buildings is usually realized using devices whose main element is a prismatic beam with a rectangular cross-section. Such beams have been the subject of scientific research; they are usually constructed from a carrying substrate without piezoelectric characteristics together with a piezoelectric material. In contrast, this investigation sought to create a beam structure from a piezoelectric composite only. The entire beam structure was made of a prototype piezoelectric particulate composite. Based on the voltage waveforms obtained in laboratory experiments and the known geometry of the specimens, a series of finite element method (FEM) simulations was performed, aiming to estimate the value of the piezoelectric coefficient d31 at which the measured voltage could be achieved. In each specimen, sedimentation caused the formation of two distinct layers: top and bottom. The experiments revealed that the presented prototype piezoelectric particulate composite converts mechanical stress to electric energy in bending mode, which is used in energy harvesting from mechanical vibration. It is self-supporting, and thus a carrying substrate is not required in the harvester structure.
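For context, d31 is the transverse coupling coefficient in the standard strain-charge form of the piezoelectric constitutive equations (textbook notation, not reproduced from the paper; axis 1 is along the beam, axis 3 through the thickness):

```latex
% Strain-charge form; S = strain, T = stress, E = electric field,
% D = electric displacement, s^E = compliance at constant field,
% eps^T = permittivity at constant stress.
\begin{aligned}
  S_1 &= s^{E}_{11}\, T_1 + d_{31}\, E_3 \\
  D_3 &= d_{31}\, T_1 + \varepsilon^{T}_{33}\, E_3
\end{aligned}
```

In bending mode the axial stress T1 alternates along the beam, so the d31 term is what converts vibration into charge across the thickness.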


Figure 1 Label occurrences on different regions. Number of occurrences of each sound type in recordings collected from Spain, Southern France and Central France.
Figure 2 Number of active classes throughout the dataset. Distribution of number of active classes in dataset recordings.
Figure 6 Number of simultaneous active classes over the total duration of the data. Distribution of simultaneous number of active classes on the total duration of the recordings.
An example of NIPS4Bplus temporal annotations.
NIPS4Bplus: a richly annotated birdsong audio dataset

October 2019 · 420 Reads · 33 Citations

Recent advances in birdsong detection and classification have approached a limit due to the lack of fully annotated recordings. In this paper, we present NIPS4Bplus, the first richly annotated birdsong audio dataset, comprising recordings containing bird vocalisations along with their active species tags plus the temporal annotations acquired for them. Statistical information about the recordings, their species-specific tags and their temporal annotations is presented, along with example uses. NIPS4Bplus could be used in various ecoacoustic tasks, such as training models for bird population monitoring, species classification, and birdsong vocalisation detection and classification.
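As a usage illustration, counting how often each species tag occurs in the temporal annotations could look like this. The file layout below (one CSV per recording with start time, duration, tag rows, in a directory such as "temporal_annotations_nips4b") is an assumption for the sketch, not a documented API.

```python
import csv
from collections import Counter
from pathlib import Path

def count_tag_occurrences(annotation_dir):
    # Assumed layout: one CSV per recording, each row holding
    # (start time, duration, tag) for one annotated vocalisation.
    counts = Counter()
    for csv_path in Path(annotation_dir).glob("*.csv"):
        with open(csv_path, newline="") as f:
            for row in csv.reader(f):
                if len(row) >= 3:
                    counts[row[2].strip()] += 1
    return counts

# e.g. count_tag_occurrences("temporal_annotations_nips4b").most_common(10)
```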


Towards the Acoustic Monitoring of Birds Migrating at Night

June 2019 · 406 Reads · 8 Citations · Biodiversity Information Science and Standards

Every year billions of birds migrate between their breeding and wintering areas. As birds are an important indicator in nature conservation, migratory bird studies have been conducted for many decades, mostly by bird-ringing programmes and direct observation. However, most birds migrate at night, and therefore much information about their migration is lost. Novel methods have been developed to overcome this difficulty, including thermal imaging, radar, geolocation techniques, and acoustic recognition of bird calls. Many bird species are detected more readily by their characteristic sounds than by direct observation, and therefore recordings are widely used in avian research. The commonly used approach is to record the birds automatically and to study the bird sounds in the recordings manually afterwards (Furnas and Callas 2015, Frommolt 2017). However, tagging recordings is a tedious and time-consuming process that requires expert knowledge, and, as a result, automatic detection of flight calls is in high demand. The first experiments towards this used energy thresholds or template matching (Bardeli et al. 2010, Towsey et al. 2012); later, machine learning and deep learning methods were applied (Stowell et al. 2018). Nevertheless, not many studies have focused specifically on night flight calls (Salamon et al. 2016, Lostanlen et al. 2018). Such acoustic monitoring could complement daytime avian research, especially when the field recording station is close to a bird-ringing station, as it is in our project. In this study, we present the initial results of a long-term bird audio monitoring project using automatic methods for bird detection. Passive acoustic recorders were deployed at a narrow spit between a lake and the Baltic Sea in Dąbkowice, West Pomeranian Voivodeship, Poland. We recorded bird calls nightly from sunset till sunrise during the passerine autumn migration for 3 seasons, collecting over 3000 hours of recordings each season. We annotated a subset of over 50 hours, from different nights with various weather conditions. As avian flight calls are sporadic and short, we created a balanced training set: recordings were divided into partially overlapping 500-ms clips, and we retained all clips containing calls and created about the same number of clips without bird sounds. Different signal representations were then examined (e.g. mel-spectrograms and multitaper spectrograms). Afterwards, various convolutional neural networks were evaluated and their performance compared using the area under the receiver operating characteristic curve (AUC). Moreover, an initial attempt was made to take advantage of transfer learning from image classification models. The results obtained by the deep learning methods are promising (AUC exceeding 80%), but higher bird detection accuracy is still needed. For a chosen bird species, the Song thrush (Turdus philomelos), we observed a correlation between calls recorded at night and birds caught in the nets during the day. This fact, as well as the promising results from the detection of calls in long-term recordings, indicates that acoustic monitoring of nocturnal birds has great potential and could be used to supplement research on seasonal bird migration.
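A minimal sketch of the clip-and-featurise step described above, using librosa; the sample rate, mel-band count and overlap are assumptions for illustration, not the project's exact settings.

```python
import librosa
import numpy as np

def recording_to_melspecs(path, clip_s=0.5, hop_s=0.25, sr=22050, n_mels=64):
    # Slice a night recording into partially overlapping 500-ms clips
    # and compute a log-mel spectrogram for each, as CNN input.
    y, sr = librosa.load(path, sr=sr)
    clip_len, hop_len = int(clip_s * sr), int(hop_s * sr)
    specs = []
    for start in range(0, max(1, len(y) - clip_len + 1), hop_len):
        clip = y[start:start + clip_len]
        mel = librosa.feature.melspectrogram(y=clip, sr=sr, n_mels=n_mels)
        specs.append(librosa.power_to_db(mel, ref=np.max))
    return np.stack(specs)  # (n_clips, n_mels, n_frames)
```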


Fig. 2. Number of occurrences of each sound type in recordings collected from Spain, Southern France and Central France.
Fig. 3. Distribution of number of active classes in dataset recordings.
Fig. 5. Distribution of simultaneous number of active classes on the total duration of the recordings.
NIPS4Bplus: a richly annotated birdsong audio dataset

November 2018 · 333 Reads

Recent advances in birdsong detection and classification have approached a limit due to the lack of fully annotated recordings. In this paper, we present NIPS4Bplus, the first richly annotated birdsong audio dataset, comprising recordings containing bird vocalisations along with their active species tags plus the temporal annotations acquired for them. Statistical information about the recordings, their species-specific tags and their temporal annotations is presented, along with example uses. NIPS4Bplus could be used in various ecoacoustic tasks, such as training models for bird population monitoring, species classification, and birdsong vocalisation detection and classification.


Automatic acoustic detection of birds through deep learning: The first Bird Audio Detection challenge

October 2018 · 1,478 Reads · 383 Citations

Assessing the presence and abundance of birds is important for monitoring specific species as well as overall ecosystem health. Many birds are most readily detected by their sounds, and thus, passive acoustic monitoring is highly appropriate. Yet acoustic monitoring is often held back by practical limitations such as the need for manual configuration, reliance on example sound libraries, low accuracy, low robustness, and limited ability to generalise to novel acoustic conditions. Here, we report outcomes from a collaborative data challenge. We present new acoustic monitoring datasets, summarise the machine learning techniques proposed by challenge teams, conduct detailed performance evaluation, and discuss how such approaches to detection can be integrated into remote monitoring projects. Multiple methods were able to attain performance of around 88% area under the receiver operating characteristic (ROC) curve (AUC), much higher performance than previous general‐purpose methods. With modern machine learning, including deep learning, general‐purpose acoustic bird detection can achieve very high retrieval rates in remote monitoring data, with no manual recalibration, and no pretraining of the detector for the target species or the acoustic conditions in the target environment.
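The challenge metric, AUC, is threshold-free: it measures how well the detector ranks bird-positive clips above bird-negative ones. With scikit-learn it is a one-liner (the scores below are toy data for illustration):

```python
from sklearn.metrics import roc_auc_score

y_true = [0, 0, 1, 1, 1]                   # 1 = clip contains a bird sound
y_score = [0.10, 0.40, 0.35, 0.80, 0.90]   # detector output probabilities
print(roc_auc_score(y_true, y_score))      # 0.833... for this toy data
```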


Citations (8)


... Few-shot sound event detection was introduced in [10], and FSBSED in [6]. In bioacoustics, detection problems are highly diverse in terms of detection targets (e.g. a given species, call type, or emotional state), and therefore methods with a fixed ontology (e.g. ...

Reference:

Synthetic data enables context-aware bioacoustic sound event detection
Learning to detect an animal sound from five examples

Ecological Informatics

... This loss of performance when increasing the complexity of the classification task is likely due to low occurrences of some sound classes, such as buzzes, for which the model did not have enough examples to train and then detect these rare sound events. Other methods like few-shot learning (Nolasco et al., 2022;Xu et al., 2021) may be more relevant for this type of rare sounds. Increasing the data quantity using data augmentation (Li et al., 2021;Padovese et al., 2021) could also be tested to improve the results (i.e., raise mean values and reduce standard deviations). ...

Few-shot bioacoustic event detection at the DCASE 2022 challenge

... However, traditional piezoelectric materials face limitations like brittleness and limited performance diversity, which significantly impact their application range and service life [7,8]. These shortcomings have led researchers to explore the concept of composite materials within the realm of piezoelectric materials [9]. ...

Piezoelectric Particulate Composite for Energy Harvesting from Mechanical Vibration

... While these datasets contain many short recordings from a wide variety of different birds, other authors have released datasets composed of fewer but longer recordings, which imitate a real wildlife scenario. Examples of this are NIPS4BPlus 16 , which contains 687 recordings summing a total of 30 hours of recordings or BirdVox-full-night 17 , which has 6 recordings of 10 hours each. ...

NIPS4Bplus: a richly annotated birdsong audio dataset

... While PAM has not been intensively used to study bird communities in the wetlands of South Africa, it has been implemented in other ecosystems around the world and proven to yield rich data sets over a relatively short period compared with traditional methods (Stowell et al. 2019). For example, PAM has been used to study the migration patterns (Koleček et al. 2020), vocal activity (Frommolt 2017), population abundance, and basic presence/absence of several bird species (Pamula et al. 2019;Duchac et al. 2020;Bota et al. 2020). This technique has also proved effective in monitoring ecosystem health using bio-indicator species (Stowell et al. 2019) and species richness/diversity indices (Klingbeil and Willig 2015). ...

Towards the Acoustic Monitoring of Birds Migrating at Night

Biodiversity Information Science and Standards

... While BirdNET performs well in regions it was trained on, its accuracy declines in unfamiliar soundscapes due to local variations in bird vocalizations, background noise, and overlap-ping bird vocalizations (Pérez-Granados, 2023). In these cases, False Positives (FPs) often arise from other vocalizing animals not being birds, anthropogenic sounds, or weather conditions (Stowell et al., 2019;Kahl et al., 2021;Clark et al., 2023). ...

Automatic acoustic detection of birds through deep learning: The first Bird Audio Detection challenge
  • Citing Article
  • October 2018

... To address the class imbalance and limited number of training files per species, the dataset is extended with recordings from previous competitions [17,18,19,20] and additional files from Xeno-canto (XC). Furthermore, soundscapes (SC) without bird activity from the DCASE 2018 Bird Detection Task [21,22] and other sources [23] are included as a 'nocall' class and for noise augmentation. Table 1 gives an overview on the individual datasets utilized. ...

Automatic acoustic detection of birds through deep learning: the first Bird Audio Detection challenge

... This is a result of the noise generated by wind turbines being masked by the noise generated by gusts of wind, making wind turbine noise less perceptible to people living near wind farms [16]. Hence, it can be claimed that living in the direct vicinity of wind turbines can significantly adversely impact humans [17]. Similarly, in the case of infra- and low-frequency noise generated by wind turbines whose levels are below the perception threshold (about 50-70 dB [18], and even 90-100 dB for frequencies of 1-2 Hz [19,20]), it is believed that they do not have a direct adverse effect on human health [18]. ...

POMIARY HAŁASU GENEROWANEGO PRZEZ ELEKTROWNIE WIATROWE I OCENA ICH WPŁYWU NA ŚRODOWISKO [Measurements of noise generated by wind turbines and assessment of their environmental impact]

Informatyka Automatyka Pomiary w Gospodarce i Ochronie Środowiska