Manu Airaksinen

University of Helsinki | HY

DSc (Tech)


55 Publications
9,774 Reads
830 Citations


Publications (55)
Preprint
Full-text available
Self-supervised learning (SSL) is a data-driven learning approach that utilizes the innate structure of the data to guide the learning process. In contrast to supervised learning, which depends on external labels, SSL utilizes the inherent characteristics of the data to produce its own supervisory signal. However, one frequent issue with SSL method...
Article
Study objectives: To develop a non-invasive and practical wearable method for long-term tracking of infants’ sleep. Methods: An infant wearable, NAPping PAnts (NAPPA), was constructed by combining a diaper cover and a movement sensor (triaxial accelerometer and gyroscope), allowing either real-time data streaming to mobile devices or offline feature...
Article
Full-text available
Assessing infant carrying and holding (C/H), or physical infant-caregiver interaction, is important for a wide range of contexts in development research. Automated detection and quantification of infant C/H is particularly needed in long-term at-home studies where development of infants’ neurobehavior is measured using wearable devices. Here, we...
Preprint
Full-text available
The recently-developed infant wearable MAIJU provides a means to automatically evaluate infants' motor performance in an objective and scalable manner in out-of-hospital settings. This information could be used for developmental research and to support clinical decision-making, such as detection of developmental problems and guiding of their therap...
Article
Full-text available
Background: Early neurodevelopmental care and research are in urgent need of practical methods for quantitative assessment of early motor development. Here, performance of a wearable system in early motor assessment was validated and compared to developmental tracking of physical growth charts. Methods: Altogether 1358 h of spontaneous movement...
Article
Full-text available
Infant motility assessment using intelligent wearables is a promising new approach to assessing infant neurophysiological development, in which efficient signal analysis plays a central role. This study investigates the use of different end-to-end neural network architectures for processing infant motility data from wearable sensors. We focus...
Article
Background: Electroencephalogram (EEG) monitoring is recommended as routine in newborn neurocritical care to facilitate early therapeutic decisions and outcome predictions. EEG's larger-scale implementation is, however, hindered by the shortage of expertise needed for the interpretation of spontaneous cortical activity, the EEG background. We develo...
Preprint
Full-text available
When domain experts are needed to perform data annotation for complex machine-learning tasks, reducing annotation effort is crucial in order to cut down time and expenses. For cases when there are no annotations available, one approach is to utilize the structure of the feature space for clustering-based active learning (AL) methods. However, these...
Article
Full-text available
Background: Early neurodevelopmental care needs better, effective and objective solutions for assessing infants’ motor abilities. Novel wearable technology opens possibilities for characterizing spontaneous movement behavior. This work seeks to construct and validate a generalizable, scalable, and effective method to measure infants’ spontaneous mot...
Preprint
Full-text available
Infant motility assessment using intelligent wearables is a promising new approach to assessing infant neurophysiological development, in which efficient signal analysis plays a central role. This study investigates the use of different end-to-end neural network architectures for processing infant motility data from wearable sensors. We focus...
Article
Full-text available
Neonatal brain monitoring in neonatal intensive care units (NICUs) requires a continuous review of the spontaneous cortical activity, i.e., the electroencephalogram (EEG) background activity. This calls for the development of bedside methods for automated assessment of the EEG background activity. In this paper, we present development of the key com...
Article
Full-text available
Objective: To develop a non-invasive and clinically practical method for long-term monitoring of infant sleep cycling in the intensive care unit. Methods: Forty-three infant polysomnography recordings were performed at 1–18 weeks of age, including a piezo element bed mattress sensor to record respiratory and gross-body movements. The hypnogram sco...
Article
Full-text available
Infants’ spontaneous and voluntary movements mirror developmental integrity of brain networks since they require coordinated activation of multiple sites in the central nervous system. Accordingly, early detection of infants with atypical motor development holds promise for recognizing those infants who are at risk for a wide range of neurodevelopm...
Preprint
Full-text available
Infants' spontaneous and voluntary movements mirror developmental integrity of brain networks since they require coordinated activation of multiple sites in the central nervous system. Accordingly, early detection of infants with atypical motor development holds promise for recognizing those infants who are at risk for a wide range of neurodevelopm...
Conference Paper
Full-text available
This study explores various speech data augmentation methods for the task of noise-robust fundamental frequency (F0) estimation with neural networks. The explored augmentation strategies are split into additive noise and channel-based augmentation and into vocoder-based augmentation methods. In vocoder-based augmentation, a glottal vocoder is used...
Article
In this article, three adaptation methods are compared based on how well they change the speaking style of a neural network based text-to-speech (TTS) voice. The speaking style conversion adopted here is from normal to Lombard speech. The selected adaptation methods are: auxiliary features (AF), learning hidden unit contribution (LHUC), and fine-tu...
Article
Full-text available
Glottal inverse filtering (GIF) refers to technology to estimate the source of voiced speech, the glottal flow, from speech signals. When a new GIF algorithm is proposed, its accuracy needs to be evaluated. However, the evaluation of GIF is problematic because the ground truth, the real glottal volume velocity signal generated by the vocal folds, c...
Article
Estimation of glottal source information can be performed non-invasively from speech by using glottal inverse filtering (GIF) methods. However, the existing GIF methods are sensitive even to slight distortions in speech signals under different realistic scenarios, for example, in coded telephone speech. Therefore, there is a need for robust GIF met...
Conference Paper
Full-text available
Feature extraction of speech signals is typically performed in short-time frames by assuming that the signal is stationary within each frame. For the extraction of the spectral envelope of speech, which conveys the formant frequencies produced by the resonances of the slowly varying vocal tract, an often used frame length is within 20-30 ms. Howeve...
Article
A vocoder is used to express a speech waveform with a controllable parametric representation that can be converted back into a speech waveform. Vocoders representing their main categories (mixed excitation, glottal, sinusoidal vocoders) were compared in this study with formal and crowd-sourced listening tests. Vocoder quality was measured within th...
Preprint
Full-text available
Recent speech technology research has seen a growing interest in using WaveNets as statistical vocoders, i.e., generating speech waveforms from acoustic features. These models have been shown to improve the generated speech quality over classical vocoders in many tasks, such as text-to-speech synthesis and voice conversion. Furthermore, conditionin...
Article
Full-text available
This paper proposes a method for generating speech from filterbank mel frequency cepstral coefficients (MFCC), which are widely used in speech applications, such as ASR, but are generally considered unusable for speech synthesis. First, we predict fundamental frequency and voicing information from MFCCs with an autoregressive recurrent neural net....
Article
Recently, a quasi-closed phase (QCP) analysis of speech signals for accurate glottal inverse filtering was proposed. However, the QCP analysis, which belongs to the family of temporally weighted linear prediction (WLP) methods, uses the conventional forward type of sample prediction. This may not be the best choice especially in computing WLP models...
Article
Full-text available
A new method is proposed for solving the glottal inverse filtering (GIF) problem. The goal of GIF is to separate an acoustical speech signal into two parts: the glottal airflow excitation and the vocal tract filter. To recover such information one has to deal with a blind deconvolution problem. This ill-posed inverse problem is solved under a deter...
Article
Linear prediction (LP) is a prevalent source-filter separation method of speech production. One of the drawbacks of conventional LP-based approaches is the biasing of estimated formants by harmonic peaks. Methods such as discrete all-pole modeling and weighted LP have been proposed to overcome this problem, but they all use a linear frequency scale...
Article
This study proposes an approach for glottal inverse filtering of acoustic speech signals using quadratic programming (QPR). The method aims to jointly model the effect of vocal tract and lip radiation with a single filter whose coefficients are optimized using QPR. This optimization is based on the principles of closed phase analysis, where the con...
Conference Paper
Full-text available
Achieving high quality and naturalness in statistical parametric synthesis of female voices remains difficult despite recent advances in the study area. Vocoding is one such key element in all statistical speech synthesizers that is known to affect the synthesis quality and naturalness. The present study focuses on a special type of vocod...
Conference Paper
In the analysis of speech production, glottal inverse filtering has proved to be an effective yet non-invasive method for obtaining information about the voice source. One of the main challenges of the existing methods is blind estimation of the contribution of the lip radiation, which must often be manually determined. To obtain a fully automatic...
Conference Paper
Parameterization of the glottal flow is a process where the glottal flow is represented in terms of a few numerical values. This study proposes a novel parameterization technique called the phase plane symmetry (PPS) parameter that utilizes the symmetrical properties of the phase plane plot. Phase plane is a way to graphically visualize the glottal...
Article
This study presents a new glottal inverse filtering (GIF) technique based on closed phase analysis over multiple fundamental periods. The proposed quasi closed phase (QCP) analysis method utilizes weighted linear prediction (WLP) with a specific attenuated main excitation (AME) weight function that attenuates the contribution of the glottal source...
Conference Paper
Full-text available
This study presents a new glottal inverse filtering (GIF) technique based on the closed phase analysis over multiple fundamental periods. The proposed Quasi Closed Phase Analysis (QCP) method utilizes Weighted Linear Prediction (WLP) with a specific Attenuated Main Excitation (AME) weighting function that attenuates the contribution of the glottal...
Conference Paper
Full-text available
In this study, the acoustic properties of shouted speech are analyzed in relation to normal speech, and various synthesis techniques for shouting are investigated. The analysis shows large differences between the two styles, which induces difficulties in synthesis. Analysis-synthesis experiments show that the use of spectral estimation methods th...
Article
This paper presents a new glottal inverse filtering (GIF) method that utilizes a Markov chain Monte Carlo (MCMC) algorithm. First, initial estimates of the vocal tract and glottal flow are evaluated by an existing GIF method, iterative adaptive inverse filtering (IAIF). Simultaneously, the initially estimated glottal flow is synthesized using the R...
