Stefano Squartini

Università Politecnica delle Marche | Università degli Studi di Ancona · Department of Information Engineering (DII)

PhD

About

265
Publications
58,818
Reads
4,085
Citations
Additional affiliations
January 2004 - December 2012
January 2004 - present
University of Stirling
Education
November 2002 - November 2005
Università Politecnica delle Marche
Field of study
  • Digital Signal Processing
November 1995 - March 2002
Università Politecnica delle Marche
Field of study
  • Electronic Engineering

Publications (265)
Article
Full-text available
It is a well-established practice to build a robust system for sound event detection by training supervised deep learning models on large datasets, but audio data collection and labeling are often challenging and require large amounts of effort. This paper proposes a workflow based on few-shot metric learning for emergency siren detection performed...
Preprint
Continuous speech separation (CSS) is a recently proposed framework which aims at separating each speaker from an input mixture signal in a streaming fashion. Hereafter we perform an evaluation study on practical design considerations for a CSS system, addressing important aspects which have been neglected in recent works. In particular, we focus o...
Preprint
Full-text available
Speech separation and speaker diarization have strong similarities, in particular with respect to end-to-end neural diarization (EEND) methods. Separation aims at extracting each speaker from overlapped speech, while diarization identifies time boundaries of speech segments produced by the same speaker. In this paper, we carry out an analysis of th...
Article
Full-text available
The pathway toward the reduction of greenhouse gas emissions is dependent upon increasing Renewable Energy Sources (RESs), demand response, and electrification of public and private transportation. Energy management techniques are necessary to coordinate the operation in this complex scenario, and in recent years several works have appeared in the...
Preprint
Recent work on monaural source separation has shown that performance can be increased by using fully learned filterbanks with short windows. On the other hand it is widely known that, for conventional beamforming techniques, performance increases with long analysis windows. This applies also to most hybrid neural beamforming methods which rely on a...
Preprint
This paper describes a novel Deep Learning method for the design of IIR parametric filters for automatic audio equalization. A simple and effective neural architecture, named BiasNet, is proposed to determine the IIR equalizer parameters. An output denormalization technique is used to obtain accurate tuning of the IIR filters center frequency, qual...
Article
We study the problem of detecting and counting simultaneous, overlapping speakers in a multichannel, distant-microphone scenario. Focusing on a supervised learning approach, we treat Voice Activity Detection (VAD), Overlapped Speech Detection (OSD), joint VAD and OSD (VAD+OSD) and speaker counting in a unified way, as instances of a general Overlap...
Preprint
Fully exploiting ad-hoc microphone networks for distant speech recognition is still an open issue. Empirical evidence shows that being able to select the best microphone leads to significant improvements in recognition without any additional effort on front-end processing. Current channel selection techniques either rely on signal, decoder or poste...
Chapter
Nowadays, cars host an increasing number of sensors to improve safety, efficiency and comfort. Acoustic sensors have been proposed, in recent works, to acquire information related to the road conditions. Thanks to the effectiveness of Deep Learning techniques in analyzing audio data, new scenarios can be envisioned. Based on previous works employing Co...
Preprint
Full-text available
The continuously growing amount of monitored data in the Industry 4.0 context requires strong and reliable anomaly detection techniques. The advancement of Digital Twin technologies allows for realistic simulations of complex machinery, therefore, it is ideally suited to generate synthetic datasets for the use in anomaly detection approaches when c...
Article
The main goal of the present work is to create a comparative evaluation methodology based on the correlation between the objective parameters of traditional acoustics and the psychoacoustic subjective parameters investigated with listening tests. This led to the definition of a quality index for kitchen hoods calculated from measurable objective parame...
Article
The Rhodes piano is an electromechanical keyboard instrument, released for the first time in 1946 and subsequently manufactured for at least four decades, reaching an iconic status and being now generally referred to as the electric piano. A few academic works discuss its operating principle and propose different physical modeling strategies; howev...
Chapter
Emergency Siren Recognition (ESR) is an important issue for automotive safety. We are interested in the early recognition of ambulance sirens in urban scenarios, where noise can be produced by a wide variety of sources and represents an impediment to the perception of alarm sounds by drivers. In this paper, we propose a deep convolutional neural ne...
Article
Full-text available
The continuously growing amount of monitored data in the Industry 4.0 context requires strong and reliable anomaly detection techniques. The advancement of Digital Twin technologies allows for realistic simulations of complex machinery, therefore, it is ideally suited to generate synthetic datasets for the use in anomaly detection approaches when c...
Article
This paper presents a novel multichannel audio equalization technique based on evolutionary computation algorithms for tuning the filter coefficients. Specifically, two distinct evolutionary algorithms are used, namely Particle Swarm Optimization (PSO) and the Gravitational Search Algorithm (GSA). Two alternative solutions for the definit...
Article
Full-text available
Audio equalization is an active research topic aiming at improving the audio quality of a loudspeaker system by correcting the overall frequency response using linear filters. The estimation of their coefficients is not an easy task, especially in binaural and multipoint scenarios, due to the contribution of multiple impulse responses to each liste...
Chapter
In this paper, we propose an algorithm for snoring sound detection based on convolutional recurrent neural networks (CRNN). The log Mel energy spectrum of the audio signal is extracted from overnight recordings and is used as input to the CRNN with the aim of detecting the precise onset and offset times of the sound events. The dataset used in the exp...
Chapter
Issues relating to energy conservation and efficiency have gained great importance, from the point of view of both the consumer and the energy provider. Furthermore, over the years, the infrastructures for energy distribution have undergone an ageing process, which has led to the study of the possibility in smart grids implementa...
Chapter
The recent success of Deep Neural Networks (DNN) in several application scenarios drove the scientific community to employ this paradigm also for NILM. Kelly and Knottenbelt compared three alternative DNNs: in the first, they employed a convolutional layer followed by long short-term memory (LSTM) layers to estimate the disaggregated signal from th...
Chapter
Computers are able to perform complex calculations in a short amount of time. However, computers cannot compete with humans in common sense, the ability to recognize people, objects, and sounds, the comprehension of natural language, or the ability to learn, categorize, and generalize.
Chapter
Approaches based on hidden Markov models (HMMs) have received particular attention in recent years. AFAMAP (Additive Factorial Approximate Maximum a Posteriori) was introduced by Kolter and Jaakkola to reduce the computational burden of FHMM. The algorithm bases its operation on additive and difference FHMM, and it constrains the posteri...
Book
Research on Smart Grids has recently focused on the energy monitoring issue, with the objective of maximizing the user consumption awareness in building contexts on the one hand, and providing utilities with a detailed description of customer habits on the other. In particular, Non-Intrusive Load Monitoring (NILM), the subject of this book, represe...
Chapter
In the last fifty years, the development of new technologies has enabled machines to sustain the ever increasing computational load, thus providing the implementation capability requested by real time applications. In this context, digital signal processing played an important role especially with relation to audio systems. Several approaches have...
Preprint
Full-text available
This paper describes the speaker diarization systems developed for the Second DIHARD Speech Diarization Challenge (DIHARD II) by the Speed team. Besides describing the system, which considerably outperformed the challenge baselines, we also focus on the lessons learned from numerous approaches that we tried for single and multi-channel systems. We...
Article
In recent years, several supervised and unsupervised approaches to fall detection have been presented in the literature. These are generally based on a corpus of examples of human falls, which are, however, hard to collect. For this reason, fall detection algorithms should be designed to gather as much information as possible from the few availabl...
Article
In micro-grids, distributed energy generation based on renewable sources allows reducing the fossil fuel emissions. In order to manage the limited availability of renewable sources and to meet users’ requirements, a proper scheduling of both tasks and storage activities is needed. Moreover, the difficulty of storing energy on a large scale represen...
Article
Full-text available
Non-intrusive load monitoring (NILM) is a technique to recover source appliances from only the recorded mains in a household. NILM is unidentifiable and thus a challenging problem, because the inferred power value of an appliance given only the mains may not be unique. To mitigate the unidentifiability problem, various methods incorporating domain know...
Article
Full-text available
The integration of renewable energy sources together with an energy storage system into a distribution network has become essential not only to maintain continuous electricity supply but also to minimise electricity costs. The operational costs of this paradigm depend highly upon the optimal use of battery energy. This paper proposes day-ahead sche...
Article
One of the challenges in computational acoustics is the identification of models that can simulate and predict the physical behavior of a system generating an acoustic signal. Whenever such models are used for commercial applications, an additional constraint is the time to market, making automation of the sound design process desirable. In previou...
Article
Full-text available
In real-time energy management of a converter-based microgrid, it is difficult to determine optimal operating points of a storage system in order to save costs and minimise energy waste. The complexity arises due to time-varying electricity prices, stochastic energy sources and power demand. Many countries have imposed real-time electricity pricing...
Conference Paper
A novel end-to-end binaural sound localisation approach is proposed which estimates the azimuth of a sound source directly from the waveform. Instead of employing hand-crafted features commonly employed for binaural sound localisation, such as the interaural time and level difference, our end-to-end system approach uses a convolutional neural netw...
Article
The task of Speaker LOCalization (SLOC) has been the focus of numerous works in the research field, where SLOC is performed on pure speech data, requiring the presence of an Oracle Voice Activity Detection (VAD) algorithm. Nevertheless, this perfect working condition is not satisfied in a real world scenario, where employed VADs do commit error...
Article
Full-text available
Nowadays, measurement systems strongly rely on the Internet of Things paradigm, and typically involve miniaturized devices on purpose. In these devices, the computational resources and signal acquisition rates are limited in order to preserve battery life. In addition, the amount of streamed data is affected by the network capacity strictly related...
Preprint
A novel end-to-end binaural sound localisation approach is proposed which estimates the azimuth of a sound source directly from the waveform. Instead of employing hand-crafted features commonly employed for binaural sound localisation, such as the interaural time and level difference, our end-to-end system approach uses a convolutional neural netwo...
Conference Paper
The automatic detection of road conditions in next-generation vehicles is an important task that is getting increasing interest from the research community. Its main applications concern driver safety, autonomous vehicles, and in-car audio equalization. These applications rely on sensors that must be deployed following a trade-off between installat...
Article
Fault diagnosis of electric motors is a fundamental task for production line testing, and it is usually performed by experienced human operators. In recent years, several methods have been proposed in the literature for detecting faults automatically. Deep neural networks have been successfully employed for this task, but, up to the authors' k...
Article
Artificial sound event detection (SED) has the aim to mimic the human ability to perceive and understand what is happening in the surroundings. Nowadays, deep learning offers valuable techniques for this goal such as convolutional neural networks (CNNs). The capsule neural network (CapsNet) architecture has been recently introduced in the image pro...
Preprint
Full-text available
Non-intrusive load monitoring (NILM) is a technique to recover source appliances from only the recorded mains in a household. NILM is unidentifiable and thus a challenging problem, because the inferred power value of an appliance given only the mains may not be unique. To mitigate the unidentifiability problem, various methods incorporating domain know...
Article
Full-text available
A storage system is a key component of a microgrid. Over the last few years, research has been undertaken to determine optimal management of microgrid resources. Battery storage has a significant impact on the total operational cost as the lifetime of the battery reduces during charging and discharging cycles. In this paper, we propose optimal ener...
Preprint
Full-text available
In real-time energy management of a converter-based microgrid, it is difficult to determine optimal operating points of a storage system in order to save costs and minimise energy waste. This complexity arises due to time-varying electricity prices, stochastic energy sources and power demand. Many countries have imposed real-time electricity pricing to ef...
Chapter
In the field of multi-channel speech quality enhancement, beamforming algorithms play a key role, being able to reduce noise and reverberation by spatial filtering. To that end, an accurate knowledge of the Direction of Arrival (DOA) is crucial for the beamforming to be effective. This paper reports greatly improved DOA estimates with the use...
Chapter
In the past few years, several works describing systems for the prompt detection of falls have been presented in literature. Many of these systems address the problem of fall detection by using some handcrafted features extracted from the input signals. In the meantime interest in the use of feature learning and deep architectures has been increasi...
Chapter
With the increasing ageing population representing a challenge for society and health care systems, solutions based on ICT to prolong the independent living of older adults become critical. Among them, systems able to automatically detect falls have been investigated for several years, because many solutions that appear promising when tested in l...
Article
Cry detection is an important facility in both residential and public environments, which can meet the different needs of both private and professional users. In this paper, we investigate the problem of cry detection in professional environments, such as Neonatal Intensive Care Units (NICUs). The aim of our work is to propose a cry detection meth...
Preprint
In real-time energy management of a converter-based microgrid, it is difficult to determine optimal operating points of a storage system in order to save costs and minimise energy waste. This complexity arises due to time-varying electricity prices, stochastic energy sources and power demand. Many countries have imposed real-time electricity pricing to ef...
Preprint
Artificial sound event detection (SED) has the aim to mimic the human ability to perceive and understand what is happening in the surroundings. Nowadays, Deep Learning offers valuable techniques for this goal such as Convolutional Neural Networks (CNNs). The Capsule Neural Network (CapsNet) architecture has been recently introduced i...
Article
Full-text available
A demand response management (DRM) system is proposed here, in which a service provider determines a mutual optimal solution for the utility and the customers in a microgrid setting. Such a system may find use with a service provider interacting with the respective customers and utilities under the existence of some DRM agreements. The service prov...
Preprint
Full-text available
One of the challenges in computational acoustics is the identification of models that can simulate and predict the physical behavior of a system generating an acoustic signal. Whenever such models are used for commercial applications an additional constraint is the time-to-market, making automation of the sound design process desirable. In previous...
Conference Paper
Detecting the presence of speakers and suitably localizing them in indoor environments undoubtedly represent two important tasks in the speech processing community. Several algorithms have been proposed for Voice Activity Detection (VAD) and Speaker LOCalization (SLOC) so far, while their accomplishment by means of a joint integrated model has not re...
Conference Paper
In this paper, we propose a system for rare sound event detection using a hierarchical and multi-scaled approach based on Convolutional Neural Networks (CNN). The task consists in the detection of event onsets from artificially generated mixtures. Spectral features are extracted from frames of the acoustic signals, then a first event detection stage op...
Conference Paper
The amount of time an infant cries in a day helps the medical staff in the evaluation of his/her health conditions. Extracting this information requires a cry detection algorithm able to operate in environments with challenging acoustic conditions, since multiple noise sources, such as interfering cries, medical equipment, and people may be prese...
Preprint
Full-text available
In real-time energy management of a converter-based microgrid, it is difficult to determine optimal operating points of a storage system in order to save costs and minimise energy waste. This complexity arises due to time-varying electricity prices, stochastic energy sources and power demand. Many countries have imposed real-time electricity pricing to ef...
Chapter
Full-text available
Supporting people in their homes is an important issue both for ethical and practical reasons. Indeed, in recent years, the scientific community has devoted particular attention to detecting human falls, since the consequences of a fall are the leading cause of death for elderly people. In this paper, we propose a human fall classification syst...
Chapter
Full-text available
This paper focuses on employing Convolutional Neural Networks (CNN) with 3-D kernels for Voice Activity Detectors in multi-room domestic scenarios (mVAD). This technology is compared with the Multi Layer Perceptron (MLP) and interesting advancements are observed with respect to previous works of the authors. In order to approximate real-life scenar...
Conference Paper
Nowadays, the detection of human falls is a problem recognized by the entire scientific community. Methods that perform well use human fall samples in the training set, while methods that do not use them can only work well under certain conditions. Since examples of human falls are very difficult to retrieve, there is a strong need to develop...
Chapter
In the past few years, several works describing systems for the prompt detection of falls have been presented in literature. Many of these systems address the problem of fall detection by using some handcrafted features extracted from the input signals. In the meantime, interest in the use of feature learning and deep architectures has been increasin...
Conference Paper
Non-intrusive load monitoring (NILM) is defined as the task of retrieving the active power consumption of two or more appliances from information gathered at a single metering point. In this work, the use of the reactive aggregate power as a