Gaowen Liu

Università degli Studi di Trento | UNITN · Department of Information Engineering and Computer Science

About

Publications: 22
Reads: 2,613
Citations: 785

Publications (22)
Preprint
Full-text available
Optical flow estimation in omnidirectional videos faces two significant issues: the lack of benchmark datasets and the challenge of adapting perspective video-based methods to accommodate the omnidirectional nature. This paper proposes the first perceptually natural-synthetic omnidirectional benchmark dataset with a 360-degree field of view, FLOW36...
Preprint
Full-text available
The cross-view video synthesis task seeks to generate video sequences of one view from another, dramatically different view. In this paper, we investigate the exocentric (third-person) view to egocentric (first-person) view video generation task. This is challenging because the egocentric view is sometimes remarkably different from the exocentric view. Thus...
Article
Full-text available
This paper introduces a new metamodel-based knowledge representation that significantly improves autonomous learning and adaptation. While interest in hybrid machine learning / symbolic AI systems leveraging, for example, reasoning and knowledge graphs, is growing, we find there remains a need for both a clear definition of knowledge and...
Preprint
This paper introduces a new metamodel-based knowledge representation that significantly improves autonomous learning and adaptation. While interest in hybrid machine learning / symbolic AI systems leveraging, for example, reasoning and knowledge graphs, is growing, we find there remains a need for both a clear definition of knowledge and...
Preprint
Cross-view image generation has recently been proposed to generate images of one view from another, dramatically different view. In this paper, we investigate exocentric (third-person) view to egocentric (first-person) view image generation. This is a challenging task since the egocentric view is sometimes remarkably different from the exocentric view. Thus...
Conference Paper
In this work, we propose a novel Cycle In Cycle Generative Adversarial Network (C2GAN) for the task of keypoint-guided image generation. The proposed C2GAN is a cross-modal framework that jointly exploits keypoint and image data in an interactive manner. C2GAN contains two different types of generators, i.e., keypoint-oriented g...
Preprint
In this work, we propose a novel Cycle In Cycle Generative Adversarial Network (C$^2$GAN) for the task of keypoint-guided image generation. The proposed C$^2$GAN is a cross-modal framework that jointly exploits keypoint and image data in an interactive manner. C$^2$GAN contains two different types of generators, i.e., keypoint-o...
Article
Full-text available
Recognizing human activities from videos is a fundamental research problem in computer vision. Recently, there has been a growing interest in analyzing human behavior from data collected with wearable cameras. First-person cameras continuously record several hours of their wearers' lives. To cope with this vast amount of unlabeled and heterogeneous...
Article
Recently, head pose estimation (HPE) from low-resolution surveillance data has gained in importance. However, monocular and multi-view HPE approaches still work poorly under target motion, as facial appearance distorts owing to camera perspective and scale changes when a person moves around. To this end, we propose FEGA-MTL, a novel framework based...
Article
Supervised learning methods require sufficient labeled examples to learn a good model for classification or regression. However, available labeled data are insufficient in many applications. Active learning (AL) and domain adaptation (DA) are two strategies to minimize the required amount of labeled data for model training. AL requires the domain e...
Article
Full-text available
Complex event detection is a retrieval task with the goal of finding videos of a particular event in a large-scale unconstrained internet video archive, given example videos and text descriptions. Nowadays, different multimodal fusion schemes of low-level and high-level features are extensively investigated and evaluated for the complex event detec...
Article
Sparse coding was shown to be able to find succinct representations of stimuli. Recently, it has been successfully applied to a variety of problems in image processing analysis. Sparse coding models data vectors as a linear combination of a few elements from a dictionary. However, most existing sparse coding methods are applied for a single task on...
Conference Paper
The widespread adoption of low-cost wearable devices requires novel paradigms for analysing human behaviour. In particular, when focusing on first-person cameras continuously recording several hours of the user's life, the task of activity recognition is especially challenging. As a huge amount of unlabeled data is automatically generated in this sc...
Article
Robust action recognition under viewpoint changes has received considerable attention recently. To this end, Self-Similarity Matrices (SSMs) have been found to be effective view-invariant action descriptors. To enhance the performance of SSM-based methods, we propose Multi-task LDA, a novel multi-task learning framework for multi-view action recog...
Article
The selection of discriminative features is an important and effective technique for many computer vision and multimedia tasks. Using irrelevant features in classification or clustering tasks can degrade performance. Thus, designing efficient feature selection algorithms to remove the irrelevant features is a possible way to improve the c...
Conference Paper
Event detection from real surveillance videos with complicated background environments is always a very hard task. Unlike the traditional retrospective and interactive systems designed for this task, which are mainly executed on video fragments located within the event-occurrence time, in this paper we propose a new interactive system constru...
Conference Paper
Multimedia event detection (MED) is a retrieval task with the goal of finding videos of a particular event in a large-scale internet video archive, given example videos and text descriptions. Nowadays, different multimodal fusion schemes of low-level and high-level features are extensively investigated and evaluated for MED. For most events in M...
Conference Paper
The selection of discriminative features is an important and effective technique for many multimedia tasks. Using irrelevant features in classification or clustering tasks can degrade performance. Thus, designing efficient feature selection algorithms to remove the irrelevant features is a possible way to improve the classification or clu...
