About
409 Publications
54,226 Reads
4,968 Citations
Additional affiliations
October 2001 - October 2015
Publications (409)
Diffusion models have achieved remarkable results in image generation. However, due to the slow convergence speed, room for enhancement remains in existing loss weight strategies. In one aspect, the predefined loss weight strategy based on signal-to-noise ratio (SNR) transforms the diffusion process into a multi-objective optimization problem. Howe...
This paper introduces the DERCo (Dublin EEG-based Reading Experiment Corpus), a language resource combining electroencephalography (EEG) and next-word prediction data obtained from participants reading narrative texts. The dataset comprises behavioral data collected from 500 participants recruited through the Amazon Mechanical Turk online crowd-sou...
Many lifelog retrieval systems have been introduced that apply various approaches to their search engines. The traditional method was to match concepts, which are visual objects detected in images and semantic queries. This concept-based approach has been applied in many retrieval systems, achieving the top performance in lifelog search challenges....
Supporting Question Answering (QA) tasks is the next step for lifelog retrieval systems, similar to the progression of the parent field of information retrieval. In this paper, we propose a new pipeline to tackle the QA task in the context of lifelogging, which is based on the open-domain QA pipeline. We incorporate this pipeline into a multimodal...
Recent years have seen the increasing popularity of e-commerce platforms which have changed the shopping behaviour of customers. Valuable data from products, customers, and purchases on such e-commerce platforms enable the delivery of personalized shopping experiences, customer targeting, and product recommendations. We introduce a novel Vietnamese...
ViewsInsight revolutionizes video content retrieval with its comprehensive suite of AI-powered features, enabling users to locate relevant videos using a variety of query types effortlessly. Its intelligent query description rewriting capability ensures precise video matching, while the visual example generation feature provides a powerful tool for...
In this paper, we present an interactive video retrieval system named VideoCLIP 2.0 developed for the Video Browser Showdown in 2024. Building upon the foundation of the previous year’s system, VideoCLIP, this upgraded version incorporates several enhancements to support novice users in solving retrieval tasks quickly and effectively. Firstly, the...
This paper conducts a thorough examination of the 12th Video Browser Showdown (VBS) competition, which is a well-established international benchmarking campaign for interactive video search systems. The annual VBS competition has witnessed a steep rise in the popularity of multimodal embedding-based approaches in interactive video retrieval. The ma...
Video Question Answering (VideoQA) is a challenging task that requires the model to understand the complex nature of video data and the variety of questions that can be asked about them. Existing approaches often suffer from the problem of ambiguous answer candidates with low relevance to the visual and auditory part of the video, which limits the...
This paper presents findings of the eleventh Video Browser Showdown competition, where sixteen teams competed in known-item and ad-hoc search tasks. Many of the teams utilized state-of-the-art video retrieval approaches that demonstrated high effectiveness in challenging search scenarios. In this paper, a broad survey of all utilized approaches is...
Video-language learning has attracted significant attention in the fields of multimedia, computer vision and natural language processing in recent years. One of the key challenges in this area is how to effectively integrate visual and linguistic information to enable machines to understand video content and query information. In this work, we leve...
BACKGROUND
Globally, heart failure (HF) affects more than 64 million people, and attempts to reduce its social and economic burden are a public health priority. Interventions to support people with HF to self-manage have been shown to reduce hospitalisations, improve quality of life, and reduce mortality rates. Understanding how people self-manage is...
Background
Globally, heart failure (HF) affects more than 64 million people, and attempts to reduce its social and economic burden are a public health priority. Interventions to support people with HF to self-manage have been shown to reduce hospitalizations, improve quality of life, and reduce mortality rates. Understanding how people self-manage...
Lifelogging is a form of personal data collection which seeks to capture the totality of one’s experience through intelligent technology and sensors. Yet despite notable advancement in such technologies, there remain persistent challenges to developing interactive systems to analyse the types of large-scale personal collections often generated by l...
Organising and preprocessing are crucial steps in order to perform analysis on lifelogs. This paper presents a method for preprocessing, enriching, and segmenting lifelogs based on GPS trajectories and images captured from wearable cameras. The proposed method consists of four components: data cleaning, stop/trip point classification, post-processi...
The COVID-19 pandemic has brought significant changes across society. This Delphi study aimed to gain expert consensus on challenges faced and resource needs for autistic children during the COVID-19 pandemic. Round 1 of the Delphi method employed semi-structured interviews with experts (N = 24) which were thematically analysed in order to identify...
Recent years have witnessed an increasing amount of dialogue/conversation on the web especially on social media. That inspires the development of dialogue-based retrieval, in which retrieving videos based on dialogue is of increasing interest for recommendation systems. Different from other video retrieval tasks, dialogue-to-video retrieval uses st...
Many models have been proposed for vision and language tasks, especially the image-text retrieval task. State-of-the-art (SOTA) models in this challenge contain hundreds of millions of parameters. They also were pretrained on large external datasets that have been proven to significantly improve overall performance. However, it is not easy to propo...
Many models have been proposed for vision and language tasks, especially the image-text retrieval task. All state-of-the-art (SOTA) models in this challenge contained hundreds of millions of parameters. They also were pretrained on a large external dataset that has been proven to make a big improvement in overall performance. It is not easy to prop...
The Lifelog Search Challenge (LSC) is an interactive benchmarking evaluation workshop for lifelog retrieval systems. The challenge was first organised in 2018 aiming to find the system that can quickly retrieve relevant lifelog images for a given semantic query. This paper provides an analysis of the performance of all 17 systems participating in t...
Stress is a complex issue with wide-ranging physical and psychological impacts on human daily performance. Specifically, acute stress detection is becoming a valuable application in contextual human understanding. Two common approaches to training a stress detection model are subject-dependent and subject-independent training methods. Although subje...
For the fifth time since 2018, the Lifelog Search Challenge (LSC) facilitated a benchmarking exercise to compare interactive search systems designed for multimodal lifelogs. LSC'22 attracted nine participating research groups who developed interactive lifelog retrieval systems enabling fast and effective access to lifelogs. The systems competed in...
NTCIR-16 saw the fourth edition of the Lifelog task, which aimed to foster comparative benchmarking of approaches to automatic and interactive information retrieval from multimodal lifelog archives. In this paper, we describe the test collection employed, along with the tasks, the submissions and the findings from this NTCIR-16 Lifelog-4 LEST sub-ta...
We have witnessed the rise of cross-data against multimodal data problems recently. The cross-modal retrieval system uses a textual query to look for images; the air quality index can be predicted using lifelogging images; the congestion can be predicted using weather and tweets data; daily exercises and meals can help to predict the sleeping quali...
Stress is a complex issue with wide-ranging physical and psychological impacts on human daily performance. Specifically, acute stress detection is becoming a valuable application in contextual human understanding. Two common approaches to training a stress detection model are subject-dependent and subject-independent training methods. Although subj...
Video retrieval systems have a wide range of applications across multiple domains; therefore, the development of user-friendly and efficient systems is necessary. For VBS 2022, we develop a flexible interactive system for video retrieval, namely V-FIRST, that supports two scenarios of usage: query with text descriptions and query with visual example...
Recollecting details from lifelog data involves a higher level of granularity and reasoning than a conventional lifelog retrieval task. Investigating the task of Question Answering (QA) in lifelog data could help in human memory recollection, as well as improve traditional lifelog retrieval systems. However, there has not yet been a standardised be...
Exploring video clips in a vast collection of videos is a difficult task. It is necessary to provide an efficient system for users to express the information need for sought events in that video collection. Thus, we propose to develop AVSeeker – an active video retrieval engine – to assist users in finding appropriate moments in videos with two mai...
In the last decade, user-centric video search competitions have facilitated the evolution of interactive video search systems. So far, these competitions focused on a small number of search task categories, with few attempts to change task category configurations. Based on our extensive experience with interactive video search contests, we have ana...
Nowadays, research on lifelog retrieval is attracting increasing attention with a focus on applying machine learning, especially for data annotation/enrichment which is necessary to facilitate effective retrieval. In this paper, we propose two annotation approaches that apply state-of-the-art text/visual and joint embedding technologies for lifelog...
In this paper, we describe a novel approach to the prediction of human blood glucose levels by analysing rich biometric human contextual data from a pioneering lifelog dataset. Numerous prediction models (RF, SVM, XGBoost and Elastic-Net) along with different combinations of input attributes are compared. An efficient ensemble method of stacking of...
The Video Browser Showdown addresses difficult video search challenges through an annual interactive evaluation campaign attracting research teams focusing on interactive video retrieval. The campaign aims to provide insights into the performance of participating interactive video retrieval systems, tested by selected search tasks on large video co...
Letters, diaries, postcards, photo albums, home videos, and lifelogs! These are artefacts of our personal history; they represent how we cherish and preserve memories, re-engage with our past and share our experiences with others. In this demonstration paper, we explore a Virtual Reality (VR) approach to help people reminisce about the past throug...
In this paper, we introduce a multi-user hierarchical video search tool called Videofall. Our objective, in the Video Browser Showdown (VBS) 2022, is to explore if Videofall interactive video retrieval under time constraints is a useful approach to take, given the overhead of requiring multiple users. It is our conjecture that combining different s...
Conventional approaches to image-text retrieval mainly focus on indexing visual objects appearing in pictures but ignore the interactions between these objects. Such object occurrences and interactions are equally useful and important in this field as they are usually mentioned in the text. Scene graph presentation is a suitable method for th...
Identifying stress levels can provide valuable data for mental health analytics as well as labels for annotation systems. Although much research has been conducted into stress detection models using heart rate variability at a higher cost of data collection, there is a lack of research on the potential of using low-resolution Electrodermal Activity...
Identifying stress levels can provide valuable data for mental health analytics as well as labels for annotation systems. Although much research has been conducted into stress detection models using heart rate variability at a higher cost of data collection, there is a lack of research on the potential of using low-resolution Electrodermal Activity...
The Lifelog Search Challenge (LSC) is an annual benchmarking challenge for comparing approaches to interactive retrieval from multi-modal lifelogs. LSC'21, the fourth challenge, attracted sixteen participants, each of which had developed interactive retrieval systems for large multimodal lifelogs. These interactive retrieval systems participated in...
Comprehensive and fair performance evaluation of information retrieval systems represents an essential task for the current information age. Whereas Cranfield-based evaluations with benchmark datasets support development of retrieval models, significant evaluation efforts are required also for user-oriented systems that try to boost performance wit...
Autism specific transition resources (T-Res) aims to develop a flexible resource package to support children and young people with a diagnosis of autism spectrum disorder (ASD), as well as their families and educators, during the loosening and/or lifting of coronavirus disease 2019 (COVID-19) related restrictions on movement. A secondary aim is to...
Understanding the relationship between objects in an image is an important challenge because it can help to describe actions in the image. In this paper, a graphical data structure, named “Scene Graph”, is utilized to represent an encoded informative visual relationship graph for an image, which we suggest has a wide range of potential applications...
The Video Browser Showdown (VBS) is an annual competition in which each participant prepares an interactive video retrieval system and partakes in a live comparative evaluation at the annual MMM Conference. In this paper, we introduce Eolas, which is a prototype video/image retrieval system incorporating a novel virtual reality (VR) interface. For...
Lifelogging can be described as the process by which individuals use various software and hardware devices to gather large archives of multimodal personal data from multiple sources and store them in a personal data archive, called a lifelog. The Lifelog task at NTCIR was a comparative benchmarking exercise with the aim of encouraging research into...
MART (Micro-activity Retrieval Task) was an NTCIR-15 collaborative benchmarking pilot task. The NTCIR-15 MART pilot aimed to motivate the development of first-generation techniques for high-precision micro-activity detection and retrieval, to support the identification and retrieval of activities that occur over short time-scales such as minutes, rath...
This paper presents an overview of the ImageCLEF 2020 lab that was organized as part of the Conference and Labs of the Evaluation Forum - CLEF Labs 2020. ImageCLEF is an ongoing evaluation initiative (first run in 2003) that promotes the evaluation of technologies for annotation, indexing and retrieval of visual data with the aim of providing infor...
Information retrieval and multimedia content access have a long history of comparative evaluation, and many of the advances in the area over the past decade can be attributed to the availability of open datasets that support comparative and repeatable experimentation. Hence, sharing data and code to allow other researchers to replicate research res...
This paper presents an overview of the ImageCLEF 2020 lab that was organized as part of the Conference and Labs of the Evaluation Forum-CLEF Labs 2020. ImageCLEF is an ongoing evaluation initiative (first run in 2003) that promotes the evaluation of technologies for annotation, indexing and retrieval of visual data with the aim of providing informa...