Georgios Paraskevopoulos

Georgios Paraskevopoulos
National Technical University of Athens | NTUA · Division of Signals, Control and Robotics

PhD Candidate at National Technical University of Athens

About

29
Publications
6,559
Reads
How we measure 'reads'
A 'read' is counted each time someone views a publication summary (such as the title, abstract, and list of authors), clicks on a figure, or views or downloads the full-text. Learn more
314
Citations

Publications

Publications (29)
Conference Paper
Full-text available
Despite the fact that in some areas of cultural life, as in the case of certain online video platforms or TV programs, notable progress has been made to provide content accessible to Deaf and Hard of Hearing people (DHH), the same cannot be said for live theater performances. In this work, a system called NLP-Theatre is presented , with the emphasi...
Article
Full-text available
Modern businesses are obligated to conform to regulations to prevent physical injuries and ill health for anyone present on a site under their responsibility, such as customers, employees and visitors. Safety officers (SOs) are engineers, who perform site audits to businesses, record observations regarding possible safety issues and make appropriat...
Preprint
Full-text available
Recent deep learning Text-to-Speech (TTS) systems have achieved impressive performance by generating speech close to human parity. However, they suffer from training stability issues as well as incorrect alignment of the intermediate acoustic representation with the input text sequence. In this work, we introduce Regotron, a regularized version of...
Preprint
Full-text available
Current deep learning approaches for multimodal fusion rely on bottom-up fusion of high and mid-level latent modality representations (late/mid fusion) or low level sensory inputs (early fusion). Models of human perception highlight the importance of top-down fusion, where high-level representations affect the way sensory inputs are perceived, i.e....
Preprint
Full-text available
In this paper, we introduce EmpBot: an end-to-end empathetic chatbot. Empathetic conversational agents should not only understand what is being discussed, but also acknowledge the implied feelings of the conversation partner and respond appropriately. To this end, we propose a method based on a transformer pretrained language model (T5). Specifical...
Preprint
Full-text available
We examine the use of linear and non-linear dimensionality reduction algorithms for extracting low-rank feature representations for speech emotion recognition. Two feature sets are used, one based on low-level descriptors and their aggregations (IS10) and one modeling recurrence dynamics of speech (RQA), as well as their fusion. We report speech em...
Preprint
Full-text available
In this work we explore Unsupervised Domain Adaptation (UDA) of pretrained language models for downstream tasks. We introduce UDALM, a fine-tuning procedure, using a mixed classification and Masked Language Model loss, that can adapt to the target domain distribution in a robust and sample efficient manner. Our experiments show that performance of...
Preprint
Full-text available
In this work we propose a machine learning model for depression detection from transcribed clinical interviews. Depression is a mental disorder that impacts not only the subject's mood but also the use of language. To this end we use a Hierarchical Attention Network to classify interviews of depressed subjects. We augment the attention layer of our...
Preprint
Full-text available
This paper presents an audio visual automatic speech recognition (AV-ASR) system using a Transformer-based architecture. We particularly focus on the scene context provided by the visual information, to ground the ASR. We extract representations for audio features in the encoder layers of the transformer and fuse video features using an additional...
Article
Full-text available
In this paper we present an approach towards real-time hand gesture recognition using the Kinect sensor, investigating several machine learning techniques. We propose a novel approach for feature extraction, using measurements on joints of the extracted skeletons. The proposed features extract angles and displacements of skeleton joints, as the lat...
Preprint
Full-text available
We present a novel view of nonlinear manifold learning using derivative-free optimization techniques. Specifically, we propose an extension of the classical multi-dimensional scaling (MDS) method, where instead of performing gradient descent, we sample and evaluate possible "moves" in a sphere of fixed radius for each point in the embedded space. A...
Article
Full-text available
In this paper we present deep-learning models that submitted to the SemEval-2018 Task~1 competition: "Affect in Tweets". We participated in all subtasks for English tweets. We propose a Bi-LSTM architecture equipped with a multi-layer self attention mechanism. The attention mechanism improves the model performance and allows us to identify salient...
Article
Full-text available
In this paper we present a deep-learning model that competed at SemEval-2018 Task 2 "Multilingual Emoji Prediction". We participated in subtask A, in which we are called to predict the most likely associated emoji in English tweets. The proposed architecture relies on a Long Short-Term Memory network, augmented with an attention mechanism, that con...
Article
Full-text available
In this paper we present two deep-learning systems that competed at SemEval-2018 Task 3 "Irony detection in English tweets". We design and ensemble two independent models, based on recurrent neural networks (Bi-LSTM), which operate at the word and character level, in order to capture both the semantic and syntactic information in tweets. Our models...
Conference Paper
We investigate the performance of features that can capture non-linear recurrence dynamics embedded in the speech signal for the task of Speech Emotion Recognition (SER). Reconstruction of the phase space of each speech frame and the computation of its respective Recurrence Plot (RP) reveals complex structures which can be measured by performing Re...
Conference Paper
In this paper we present an approach for real-time gesture recognition using the Kinect sensor and a set of machine learning techniques. We propose a novel approach for feature extraction using measurements on skeletal joints. We select a set of simple gestures and construct a data set. We train classifiers under the assumptions that they shall be...

Network

Cited By