Gintautas Tamulevičius

Vilnius University · Institute of Data Science and Digital Technologies

PhD

About

31
Publications
3,720
Reads
181
Citations
Additional affiliations
September 2019 - present
Vilnius University
Position
  • Senior Researcher
September 2018 - present
Vilnius University
Position
  • Professor (Associate)
October 2008 - August 2019
Vilnius University Institute of Mathematics and Informatics
Position
  • Researcher
Education
October 2003 - July 2008
Vilnius Gediminas Technical University
Field of study
  • Informatics Engineering
September 2001 - June 2003
Vilnius Gediminas Technical University
Field of study
  • Electronics Engineering
September 1997 - June 2001
Vilnius Gediminas Technical University
Field of study
  • Electronics Engineering

Publications (31)
Article
Full-text available
In this research, a study of cross-linguistic speech emotion recognition is performed. For this purpose, emotional data in different languages (English, Lithuanian, German, Spanish, Serbian, and Polish) are collected, resulting in a cross-linguistic speech emotion dataset of more than 10,000 emotional utterances. Despite the bi-modal...
Chapter
The study addresses the issues related to the appropriateness of a two-dimensional representation of speech signal for speech recognition tasks based on deep learning techniques. The approach combines Convolutional Neural Networks (CNNs) and time-frequency signal representation converted to the investigated feature spaces. In particular, waveforms...
Article
Full-text available
During the last 10–20 years, many new ideas have been proposed to improve the accuracy of speech emotion recognition: e.g., effective feature sets, complex classification schemes, and multi-modal data acquisition. Nevertheless, speech emotion recognition is still a task with limited success. Considering the nonlinear and fluctuating natu...
Article
Full-text available
The automated identification system of vessel movements receives a huge amount of multivariate, heterogeneous sensor data, which should be analyzed to make a proper and timely decision on vessel movements. The large number of vessels makes it difficult and time-consuming to detect abnormalities, thus rapid response algorithms should be developed fo...
Article
Full-text available
Introduction: Recurrent laryngeal nerve injury is one of the major complications related to thyroid surgery. Intraoperative recurrent laryngeal nerve functional status monitoring is becoming a standard part of thyroid surgery. However, the current methods for intraoperative nerve functional status assessment are associated with a demand for specia...
Article
The aim of this study was to evaluate the suitability of 2D audio signal feature maps for speech recognition based on deep learning. The proposed methodology employs a convolutional neural network (CNN), which is a class of deep, feed-forward artificial neural network. The authors analyzed the audio signal feature maps, namely spectrograms, linear...
Article
Full-text available
The autoregressive (AR) model-based digital inverse filtering technique is applied to the noninvasive detection of vocal fold paralysis. The vocal tract filter is modelled using a variable-order (up to 20) AR model that suits the individual characteristics of the human voice. This enables a more accurate estimation of the glottal flow, distu...
Conference Paper
The modeling of individual speaker properties is presented in this paper. The classic autoregressive (AR) model is proposed for this purpose. The employed model order and parameter estimation technique gave a much higher model order (up to 200 in some cases) in detailed spectral analysis of speech signals. Comparison of high-order AR model-based an...
Article
Full-text available
The autoregressive (AR) model is widely used for modeling speech signals. Nevertheless, the problem of adequately modeling Lithuanian speech is still open. The results of this study indicate the need for much higher-order models for Lithuanian vowel description. Only high-order AR models enable us to model the frequency properties of Lithuanian vowel...
Conference Paper
Full-text available
This paper presents an experimental study of multi-stage classification based recognition of Lithuanian speech emotions. Three different criteria for feature selection were compared for this purpose: the Maximal Efficiency and Minimal Cross-Correlation feature criteria, and Sequential Feature Selection. A large database of spoken emotional Lithuani...
Article
Full-text available
Intensive research on speech emotion recognition has introduced a huge collection of speech emotion features. Large feature sets complicate the speech emotion recognition task. Among various feature selection and transformation techniques for one-stage classification, multiple classifier systems were proposed. The main idea of multiple classifiers...
Conference Paper
Feature selection is very relevant for the speech emotion recognition task. Still, there is no consensus on the optimal feature set and classification scheme for this task. A sequential forward selection (SFS) technique for a multistage emotion classification scheme is proposed in this paper. Feature sets were formed from an initial collection of 6552 speech emot...
Article
The problem of speech emotion recognition is commonly addressed by supplying a huge feature set containing up to a few thousand different features. This can give rise to the curse of dimensionality and degrade the speech emotion classification process. In this paper we present minimal cross-correlation based formation of multi-level features for...
Article
Intra-reference and inter-reference distances can be used for the evaluation and comparison of reference templates in template-based speech recognition. Hence, a decision on the quality of a reference template can be made and the reference set can be updated if necessary. In this paper a new reference set update technique based on analysis of vocabulary distanc...
Article
Full-text available
Various feature selection and classification schemes were proposed to improve the efficiency of speech emotion classification and recognition. In this paper we propose a multi-level organization of the classification process and features. The main idea is to perform classification of speech emotions in a step-by-step manner using different feature subsets for...
Conference Paper
Full-text available
The speed and precision of a voice control unit prototype were optimized, and it was shown that 2.6 times more words can be recognized without loss of precision by using global constraints in the optimized DTWc IP module. Hardware-optimized isolated word matching (DTWc) is executed in 128 µs, achieving a ~7800 word/s comparison spe...
Article
The paper focuses on a new vocabulary renewal algorithm designed for a hardware-implemented Lithuanian speech recognizer. Isolated word recognition is performed using dynamic time warping of the Mel-frequency cepstrum coefficients (MFCC) estimated during short-time analysis of speech signals. A self-organizing feature map is used to extract the...
Article
Full-text available
The speech signal is redundant and non-stationary by nature. Because of vocal tract inertia, its variations are not very rapid and the signal can be considered stationary over short segments. It is presumed that the short-time magnitude spectrum contains the most distinctive information of speech. This is the main reason for speech signal analysis...
Article
Full-text available
We consider the biggest challenge in speech recognition: noise reduction. Traditionally, detected transient noise pulses are removed together with the corrupted speech using pulse models. In this paper we propose to cope with the problem directly in the Dynamic Time Warping domain. A bidirectional Dynamic Time Warping algorithm for the recognition of isolated wor...
Article
The article reports on upgrading an FPGA-based isolated word recognition system for real-time tasks. All recognition system components (except some feature calculation steps) were implemented using VHDL. Some high-precision calculations were implemented on a soft-core processor. The employed dynamic time warping algorithm was sped up 2.8 t...
Conference Paper
Full-text available
The paper presents an algorithm for accelerating dynamic time warping (DTW) based isolated word recognition. The number of matching operations depends directly on the size of the vocabulary. A set of perceptual cepstrum features is calculated for each word and stored in the vocabulary as a reference. Additionally, all words (references) are...
Article
Full-text available
The paper presents visual features for speech recognition. Visual speech recognition can be applied to support the acoustic recognition process or as a stand-alone speech recognition approach when acoustic data are absent. Static geometrical features of the speaker's lips were used for recognition. Statistical analysis of the...
Article
Full-text available
The paper presents a comparative evaluation of feature extraction algorithms for a real-time isolated word recognition system based on FPGA. The Mel-frequency cepstral, linear frequency cepstral, linear predictive and their cepstral coefficients were implemented in a hardware/software design. The proposed system was investigated in speaker dependent mod...
Article
Full-text available
The article reports on upgrading an FPGA-based isolated word recognition system for real-time tasks. All recognition system components (except some feature calculation steps) were implemented using VHDL. Some high-precision calculations were implemented on a soft-core processor. The employed dynamic time warping algorithm was sped up 2.8 t...
Article
The experiences of undergoing economic crises attest that the loss of employment prompts an outbreak of mental illnesses and suicides, increases the numbers of heart attacks and strokes and negatively affects other illnesses suffered by individuals under stress. Negative stress can devastate a person, cause depression, lower productivity on the job...
Article
The paper describes a field programmable gate array implementation of the main part of a speech recognition system: feature extraction. In order to accelerate recognition, the whole cepstral analysis scheme is implemented in hardware using intellectual property cores. Two field programmable gate array devices are used for evaluation. Comparat...
Article
Full-text available
An enhancement of the FPGA implementation of a Lithuanian isolated word recognition system is presented. A software-based recognizer implementation was used as the basis for the enhancement. Feature extraction (the most time-consuming process) and local distance calculation (the most frequently performed process) were selected for hardware implementation. Red...
Article
Word segmentation into phones is studied in this paper. The method of change point detection in random sequences is used for phone boundary detection in a word. It is assumed that phones are stationary signal segments and that changes in signal parameters mark the boundaries of these phones. Change moments are detected by maximizing change points likeliho...

Projects (2)
Project
The aim of the project is to investigate and apply AI-based methods for speech signal processing. The project comprises activities such as investigating the application of different feature spaces to phoneme and uttered-word detection and classification, and developing non-invasive methods for evaluating human vocal properties.