Evaggelos Spyrou

National Center for Scientific Research Demokritos | ncsr · Institute of Informatics and Telecommunications

PhD

About

141
Publications
58,495
Reads
1,574
Citations
Additional affiliations
January 2015 - July 2015
Greek National Ministry of Defense
Position
  • Instructor
October 2011 - September 2012
University of Western Macedonia
Position
  • Lecturer (on contract)
March 2014 - present
National Center for Scientific Research Demokritos
Position
  • PostDoc Position
Education
October 2004 - December 2009
National Technical University of Athens
Field of study
  • Electrical and Computer Engineering
September 1998 - November 2003
National Technical University of Athens
Field of study
  • Electrical and Computer Engineering


Publications (141)
Article
Full-text available
During the last few years, several technological advances have led to an increase in the creation and consumption of audiovisual multimedia content. Users are overexposed to videos via several social media or video sharing websites and mobile phone applications. For efficient browsing, searching, and navigation across several multimedia collections...
Article
Full-text available
In real-life scenarios, Human Activity Recognition (HAR) from video data is prone to occlusion of one or more body parts of the human subjects involved. Although it is common sense that the recognition of the majority of activities strongly depends on the motion of some body parts, which when occluded compromise the performance of recognition appro...
Article
Full-text available
The presence of occlusion in human activity recognition (HAR) tasks hinders the performance of recognition algorithms, as it is responsible for the loss of crucial motion data. Although it is intuitive that it may occur in almost any real-life environment, it is often underestimated in most research works, which tend to rely on datasets that have b...
Article
Full-text available
The problem of human activity recognition (HAR) has been increasingly attracting the efforts of the research community, having several applications. It consists of recognizing human motion and/or behavior within a given image or a video sequence, using as input raw sensor measurements. In this paper, a multimodal approach addressing the task of vid...
Article
Full-text available
Partial domain adaptation (PDA) is a framework for mitigating the covariate shift problem when target labels are contained in source labels. For this task, adversarial neural network (ANN) methods proposed in the literature have been proven to be flexible and effective. In this work, we adapt such methods to tackle the more general problem of open-...
Conference Paper
Full-text available
Cultural tourism is a right for everyone, including people with disabilities. In this work we present a novel wearable system aiming to enhance the user experiences of individuals visiting outdoor cultural environments, such as the Historic Triangle of Athens. The developed system is based on Artificial Intelligence to offer enhanced touristic expe...
Chapter
Full-text available
One of the major challenges in Human Activity Recognition (HAR) using cameras, is occlusion of one or more body parts. However, this problem is often underestimated in contemporary research works, wherein training and evaluation is based on datasets shot under laboratory conditions, i.e., without some kind of occlusion. In this work we propose an a...
Conference Paper
Contemporary human activity recognition approaches are heavily based on deep neural network architectures, since the latter require neither significant domain knowledge nor complex algorithms for feature extraction, while they are able to demonstrate strong performance. Therefore, handcrafted features are nowadays rarely used. In this paper...
Conference Paper
The recognition of the emotions of humans is crucial for various applications related to human-computer interaction or for understanding the users’ mood in several tasks. Typical machine learning approaches used towards this goal first extract a set of linguistic features from raw data, which are then used to train supervised learning models. Recen...
Article
Full-text available
Monitoring driving behaviour is important in controlling driving risk, fuel consumption, and CO2 emissions. Recent advances in machine learning, which include several variants of convolutional neural networks (CNNs), and recurrent neural networks (RNNs), such as long short-term memory (LSTM) and gated recurrent unit (GRU) networks, could be valuabl...
Chapter
Full-text available
The problem of human activity recognition (HAR) has been increasingly attracting the efforts of the research community, having several applications. In this paper we propose a multi-modal approach addressing the task of video-based HAR. Our approach uses three modalities, i.e., raw RGB video data, depth sequences and 3D skeletal motion data. The la...
Article
Full-text available
The exponential growth of user-generated content has increased the need for efficient video summarization schemes. However, most approaches underestimate the power of aural features, while they are designed to work mainly on commercial/professional videos. In this work, we present an approach that uses both aural and visual features in order to cre...
Chapter
Emotion recognition (ER) has drawn the interest of many researchers in the field of human-computer interaction, being central in such applications as assisted living and personalized content suggestion. When considering the implementation of ER capable systems, if they are to be widely adopted in daily life, one must take into account that methods...
Article
Full-text available
The collection of video data for action recognition is very susceptible to measurement bias; the equipment used, camera angle and environmental conditions are all factors that majorly affect the distribution of the collected dataset. Inevitably, training a classifier that can successfully generalize to new data becomes a very hard problem, since it...
Conference Paper
The classification of driving behaviour is important for monitoring driving risk and fuel efficiency, as well as for providing a personalized view, or ’fingerprint’, of each driver, useful in driving assistance and the car insurance industry. Intuitively, an aggressive driving style manifests itself in the long run, with distinct frequencies of occurrenc...
Conference Paper
In this work we present an approach for the classification of driving behaviour using Convolutional Neural Networks (CNNs), based on measurements that have been obtained by the internal CAN-bus of the vehicle. As is the case with different driving behaviours, CAN-bus sensor data reflect the driving patterns associated with different types of vehicl...
Article
Full-text available
Recent advances in big data systems and databases have made it possible to gather raw unlabeled data at unprecedented rates. However, labeling such data constitutes a costly and time-consuming process. This is especially true for video data, and in particular for human activity recognition (HAR) tasks. For this reason, methods for reducing the need of labe...
Conference Paper
The classification of driving behaviour is important for monitoring driving risk and fuel efficiency, as well as for adaptive driving assistance and car insurance industry. Starting from raw measurements of acceleration and speed, as provided by a telematics device placed on each vehicle, we define features summarizing instantaneous, short-term and...
Article
Full-text available
Several studies have addressed the problem of abnormality detection in medical images using computer-based systems. The impact of such systems in clinical practice and in the society can be high, considering that they can contribute to the reduction of medical errors and the associated adverse events. Today, most of these systems are based on binar...
Article
Full-text available
Heart rate constitutes one of the most important physiological parameters for humans and is linked to the prognosis of several diseases. Moreover, its unobtrusive measurement is necessary in several real-life applications, such as remote monitoring of the elderly or infants, or even while driving or exercising. Most state-of-the-art methods are...
Article
Full-text available
In this paper we present an approach toward human action detection for activities of daily living (ADLs) that uses a convolutional neural network (CNN). The network is trained on discrete Fourier transform (DFT) images that result from raw sensor readings, i.e., each human action is ultimately described by an image. More specifically, we work using...
Article
Full-text available
Currently, in all augmented reality (AR) or virtual reality (VR) educational experiences, the evolution of the experience (game, exercise or other) and the assessment of the user’s performance are based on her/his (re)actions which are continuously traced/sensed. In this paper, we propose the exploitation of the sensors available in the AR/VR syste...
Conference Paper
Full-text available
Digital culture is a mainstay of the emerging 6V era as both the digitization of existing cultural data and the creation of original native digital cultural content directly lead to the generation of large data volumes which may well be bursty, unstructured or semi-structured, and inherently multimodal. One way to address the increased complexity a...
Conference Paper
Full-text available
In this work we aim to evaluate the user experience with a state-of-the-art commercial brain-computer interface (BCI) device, namely the Neurosky Mindwave Mobile. This device is able to measure certain types of brain waves and translate them into quantitative measurements of users' attention and meditation, using a single, non-invasive electrode....
Conference Paper
Full-text available
Emotion recognition from speech signals is an important field in its own right as well as a mainstay of many multimodal sentiment analysis systems. The latter may as well include a broad spectrum of modalities which are strongly associated with consciously or subconsciously communicating human emotional state such as visual cues, gestures, body pos...
Conference Paper
Full-text available
In this paper we present an approach for the recognition of human activity that combines handcrafted features from 3D skeletal data and contextual features learnt by a trained deep Convolutional Neural Network (CNN). Our approach is based on the idea that contextual features, i.e., features learnt in a similar problem are able to provide a diverse...
Chapter
Visual impairment restricts everyday mobility and limits the accessibility of places, which for the non-visually impaired is taken for granted. A short walk to a close destination, such as a market or a school becomes an everyday challenge. In this chapter, we present a novel solution to this problem that can evolve into an everyday visual aid for...
Conference Paper
Recurrent neural networks are an obvious choice for driving behavior analysis by means of time series of measurements, obtained either from telematics or mobile phone sensors. This work investigates such an application, employing two popular recurrent neural networks, i.e. long short-term memory networks and gated recurrent unit networks, as well a...
Article
Full-text available
In recent years, following the tremendous growth of the Web, extremely large amounts of digital multimedia content are being produced every day and are shared online mainly through several newly emerged channels, such as social networks [...]
Conference Paper
Full-text available
Tensor algebra is the next evolutionary step of linear algebra to more than two dimensions. Its plethora of applications include signal processing, big data, deep learning, multivariate numerical analysis, information retrieval, and social media analysis. As is precisely the case with data matrices, decompositions and factorizations with special p...
Conference Paper
Full-text available
In this paper we present preliminary results of an approach for understanding human actions, based on a novel 2D image representation for 3D skeletal data. More specifically, motion information for human skeletal joints is transformed to a pseudo-colored image. A Convolutional Neural Network is then used for classification. Our approach is evaluate...
Article
Full-text available
In this paper we present an approach towards real-time hand gesture recognition using the Kinect sensor, investigating several machine learning techniques. We propose a novel approach for feature extraction, using measurements on joints of the extracted skeletons. The proposed features extract angles and displacements of skeleton joints, as the lat...
Chapter
Full-text available
In this paper we present an approach for the recognition of human actions targeting activities of daily living (ADLs). Skeletal information is used to create images capturing the motion of joints in 3D space. These images are then transformed to the spectral domain using 4 well-known image transforms. A deep Convolutional Neural Network is t...
Conference Paper
Full-text available
Fibonacci numbers appear in numerous engineering and computing applications including population growth models, software engineering, task management, and data structure analysis. This mandates a computationally efficient way for generating a long sequence of successive Fibonacci integers. With the advent of GPU computing and the associated special...
Chapter
Full-text available
This paper proposes a method for recognizing audio events in urban environments that combines handcrafted audio features with a deep learning architectural scheme (Convolutional Neural Networks, CNNs), which has been trained to distinguish between different audio context classes. The core idea is to use the CNNs as a method to extract context-aware...
Article
Full-text available
It is noteworthy nowadays that monitoring and understanding a human’s emotional state plays a key role in the current and forthcoming computational technologies. On the other hand, this monitoring and analysis should be as unobtrusive as possible, since in our era the digital world has been smoothly adopted in everyday life activities. In this fram...
Poster
Full-text available
Pervasive (ubiquitous) computing is a research area whose principle is to embed some kind of computational power (i.e., using microprocessors) into everyday objects, in an effort to make them capable of communicating and performing tasks without the need for intense interaction with users. The concept of pervasive computing has recently emerged; a larg...
Article
Full-text available
Wireless Capsule Endoscopy (WCE) is a noninvasive diagnostic technique enabling the inspection of the whole gastrointestinal (GI) tract by capturing and wirelessly transmitting thousands of color images. Proprietary software “stitches” the images into videos for examination by accredited readers. However, the videos produced are of large length and...
Conference Paper
Full-text available
Modern Information and Communication Technologies (ICT) have evolved into an ever developing digital ecosystem. In such a sophisticated system information is typically transmitted through interconnected software and hardware structures, better described as the well-known concept of Internet of Things (IoT). Embedded physical devices are producing d...
Conference Paper
A novel approach for feature learning using deep learning is presented. More specifically, a Convolutional Neural Network that is trained using feature correspondences learns to map a given image patch to a descriptor. Therefore, descriptors are directly learned from examples instead of being hand-crafted. The proposed approach is evaluated in a ch...
Conference Paper
In this paper we present an approach for speaker verification, based on the extraction of deep features. More specifically, we propose a scheme that is based on a convolutional neural network. For audio representation we opt for spectrograms, i.e., images that result from the spectral content of voices. Our network is trained to extract visual...
Conference Paper
Full-text available
We present a user evaluation of 3 unobtrusive methods for heart-rate measurement. More specifically, we implement a state-of-the-art method that uses the web camera of a typical computer, we use a low-cost bracelet with an integrated photoplethysmography sensor and also a freely available Android mobile app which uses the phone's camera and flash....
Conference Paper
Full-text available
In this position paper we present an approach for the recognition of emotions from speech. Our goal is to understand the affective state of learners upon a learning process. We propose an approach that uses visual representations of the spectrum of audio segments, which are classified using the Bag-of-Visual Words model. Our approach is applied on...
Conference Paper
This paper introduces an end-to-end solution for dynamic adaptation of the learning experience for learners with different personal needs, based on their behavioural and affective reaction to the learning activities. Personal needs refer to what learners already know, what they need to learn, their intellectual and physical capacities and their learni...
Article
Full-text available
We live in an era where typical measures towards the mitigation of environmental degradation follow the identification and recording of natural parameters closely associated with it. In addition, current scientific knowledge on the one hand may be applied to minimize the environmental impact of anthropogenic activities, whereas informatics on the o...
Chapter
Full-text available
Emotion recognition plays an important role in several applications, such as human computer interaction and understanding affective state of users in certain tasks, e.g., within a learning process, monitoring of elderly, interactive entertainment etc. It may be based upon several modalities, e.g., by analyzing facial expressions and/or speech, usin...
Article
Wireless capsule endoscopy (WCE) is performed with a miniature swallowable endoscope enabling the visualization of the whole gastrointestinal (GI) tract. One of the most challenging problems in WCE is the localization of the capsule endoscope (CE) within the GI lumen. Contemporary, radiation-free localization approaches are mainly based on the use...
Article
Full-text available
The rise of social networks during the last few years has provided a vast amount of knowledge in several domains. Among them, route planning and point-of-interest recommendation have significantly benefited. Seen from the side of a tourist, they constitute two challenging and time-consuming tasks since they may rely on many parameters and are limi...
Conference Paper
Several computer-based medical systems have been proposed for automatic detection of abnormalities in a variety of medical imaging domains. The majority of these systems are based on binary supervised classification algorithms capable of discriminating abnormal from normal image patterns. However, this approach usually does not take into account th...
Article
Full-text available
Emotion recognition from speech may play a crucial role in many applications related to human–computer interaction or understanding the affective state of users in certain tasks, where other modalities such as video or physiological parameters are unavailable. In general, a human’s emotions may be recognized using several modalities such as analyzi...