Gareth J F Jones

Dublin City University | DCU · Adapt Centre / School of Computing

B.Eng., PhD

About

487
Publications
46,163
Reads
5,989
Citations
Additional affiliations
June 2003 - present
Dublin City University
Position
  • Professor (Associate)
October 1997 - September 1998
Toshiba Corporation
Position
  • Toshiba Fellow
September 1996 - May 2003
University of Exeter
Position
  • Lecturer

Publications (487)
Preprint
Full-text available
Evaluation of open-domain dialogue systems is highly challenging and development of better techniques is highlighted time and again as desperately needed. Despite substantial efforts to carry out reliable live evaluation of systems in recent competitions, annotations have been abandoned and reported as too unreliable to yield sensible results. This...
Article
Full-text available
An automated contextual suggestion algorithm is likely to recommend contextually appropriate and personalized ‘points-of-interest’ (POIs) to a user, if it can extract information from the user’s preference history (exploitation) and effectively blend it with the user’s current contextual information (exploration) to predict a POI’s ‘appropriateness...
Chapter
We present an investigation into the use of semi-supervised training and content genre adaptation for improved automatic speech recognition (ASR) of diverse user-generated videos in the task of spoken content retrieval (SCR). Previous work has successfully applied semi-supervised training in single domain ASR tasks. Our focus is on the exploration...
Preprint
Full-text available
Rapidly growing online podcast archives contain diverse content on a wide range of topics. These archives form an important resource for entertainment and professional use, but their value can only be realized if users can rapidly and reliably locate content of interest. Search for relevant content can be based on metadata provided by content creat...
Article
Full-text available
Reducing user effort in finding relevant information is one of the key objectives of search systems. Existing approaches have been shown to effectively exploit the current search session context of users for automatically suggesting queries to reduce their search efforts. However, these approaches do not accomplish the end-goal of a search system-t...
Article
The Third Workshop on Evaluation of Personalisation in Information Retrieval (WEPIR 2021) was held in conjunction with the ACM SIGIR Conference on Human Information Interaction & Retrieval (CHIIR 2021) in Canberra, Australia, as a virtual event. WEPIR 2021 followed on from the first and second WEPIRs held at CHIIR 2018 and 2019. The purpose of the...
Article
This report describes the workshop on Supporting and Understanding of (multi-party) conversational Dialogues (SUD) organized as a part of the Web Search and Data Mining conference (WSDM) 2021. The aim of the SUD workshop was to encourage researchers to investigate automated methods to analyze and understand conversations. We also discuss the release of...
Conference Paper
Full-text available
To facilitate effective translation modeling and translation studies, one of the crucial questions to address is how to assess translation quality. From the perspectives of accuracy, reliability, repeatability, and cost, translation quality assessment (TQA) itself is a rich and challenging task. In this work, we present a high-level and concise sur...
Preprint
Full-text available
To facilitate effective translation modeling and translation studies, one of the crucial questions to address is how to assess translation quality. From the perspectives of accuracy, reliability, repeatability and cost, translation quality assessment (TQA) itself is a rich and challenging task. In this work, we present a high-level and concise surv...
Preprint
Full-text available
The TREC Video Retrieval Evaluation (TRECVID) is a TREC-style video analysis and retrieval evaluation with the goal of promoting progress in research and development of content-based exploitation and retrieval of information from digital video via open, metrics-based evaluation. Over the last twenty years this effort has yielded a better understand...
Preprint
Full-text available
Chinese character decomposition has been used as a feature to enhance Machine Translation (MT) models, combining radicals into character and word level models. Recent work has investigated ideograph or stroke level embedding. However, questions remain about different decomposition levels of Chinese character representations, radical and strokes, be...
Preprint
Full-text available
Conversational search presents opportunities to support users in their search activities to improve the effectiveness and efficiency of search while reducing their cognitive load. Limitations of the potential competency of conversational agents restrict the situations for which conversational search agents can replace human intermediaries. It is th...
Preprint
Full-text available
Conversational search (CS) has recently become a significant focus of the information retrieval (IR) research community. Multiple studies have been conducted which explore the concept of conversational search. Understanding and advancing research in CS requires careful and detailed evaluation. Existing CS studies have been limited to evaluation bas...
Preprint
The Podcast Track is new at the Text Retrieval Conference (TREC) in 2020. The Podcast Track was designed to encourage research into podcasts in the information retrieval and NLP research communities. The track consisted of two shared tasks: segment retrieval and summarization, both based on a dataset of over 100,000 podcast episodes (metadata, audi...
Chapter
We present a demonstration application for dialogue-based search. In this system, a conversational agent engages with the user of an online search tool to support their search activities. Agent-supported conversational search of this type represents a fundamental advance beyond current standard search engines, such as web search tools. Analogous to...
Preprint
Full-text available
In this work, we present the construction of multilingual parallel corpora with annotation of multiword expressions (MWEs). MWEs include verbal MWEs (vMWEs) defined in the PARSEME shared task that have a verb as the head of the studied terms. The annotated vMWEs are also bilingually and multilingually aligned manually. The languages covered include...
Conference Paper
Full-text available
In this work, we present the construction of multilingual parallel corpora with annotation of multiword expressions (MWEs). MWEs include verbal MWEs (vMWEs) defined in the PARSEME shared task that have a verb as the head of the studied terms. The annotated vMWEs are also bilingually and multilingually aligned manually. The languages covered include...
Preprint
Full-text available
An automated contextual suggestion algorithm is likely to recommend contextually appropriate and personalized 'points-of-interest' (POIs) to a user, if it can extract information from the user's preference history (exploitation) and effectively blend it with the user's current contextual information (exploration) to predict a POI's 'appropriateness...
Preprint
Multi-word expressions (MWEs) are a hot topic in research in natural language processing (NLP), including topics such as MWE detection, MWE decomposition, and research investigating the exploitation of MWEs in other NLP fields such as Machine Translation. However, the availability of bilingual or multi-lingual MWE corpora is very limited. The only...
Conference Paper
Full-text available
Multi-word expressions (MWEs) are a hot topic in research in natural language processing (NLP), including topics such as MWE detection, MWE decomposition, and research investigating the exploitation of MWEs in other NLP fields such as Machine Translation. However, the availability of bilingual or multilingual MWE corpora is very limited. The only b...
Chapter
We describe our participation in the NTCIR-14 OpenLiveQ-2 task and our post-submission investigations. For a given query and a set of questions with their answers, participants in the OpenLiveQ task were required to return a ranked list of questions that potentially match and satisfy the user’s query effectively. In this paper we focus on two main...
Chapter
The Personalised Information Retrieval Lab (PIR-CLEF 2019) is an initiative aimed at both providing and critically analysing the evaluation of Personalization in Information Retrieval (PIR) applications. PIR-CLEF 2019 is the second edition of the Lab after the successful Pilot lab organised at CLEF 2017 and the first edition of the Lab at CLEF...
Chapter
We present a method for automatically summarising audio-visual recordings of academic presentations. For generation of presentation summaries, keywords are taken from automatically created transcripts of the spoken content. These are then augmented by incorporating classification output scores for speaker ratings, audience engagement, emphasised sp...
Preprint
A vast amount of audio-visual data is available on the Internet thanks to video streaming services, to which users upload their content. However, there are difficulties in exploiting available data for supervised statistical models due to the lack of labels. Unfortunately, generating labels for such an amount of data through human annotation can be ex...
Article
The Second Workshop on Evaluation of Personalisation in Information Retrieval (WEPIR 2019) was held in conjunction with the ACM SIGIR Conference on Human Information Interaction & Retrieval (CHIIR 2019) in Glasgow, Scotland. WEPIR 2019 followed on from the first WEPIR held at CHIIR 2018. The purpose of the workshop was again to bring together resea...
Article
Full-text available
The study of query performance prediction (QPP) in information retrieval (IR) aims to predict retrieval effectiveness. The specificity of the underlying information need of a query often determines how effectively a search engine can retrieve relevant documents at top ranks. The presence of ambiguous terms makes a query less specific to the sought...
Conference Paper
The second WEPIR 2019 workshop brings together researchers with different backgrounds interested in continuing to explore and advance the evaluation of personalisation in information retrieval. The workshop builds on the first WEPIR workshop held at CHIIR 2018, and will focus on further developing a common understanding of the challenges, requireme...
Article
Full-text available
The Second Workshop on Exploitation of Social Media for Emergency Relief and Preparedness (SMERP) was held in conjunction with The Web Conference (WWW) 2018 in Lyon, France. A primary aim of the workshop was to promote multi-modal and multi-view information retrieval from social media content in disaster situations. The workshop programme inclu...
Article
Full-text available
Since its inception in 2013, one of the key contributions of the CLEF eHealth evaluation campaign has been the organization of an ad-hoc information retrieval (IR) benchmarking task. This IR task evaluates systems intended to support laypeople searching for and understanding health information. Each year the task provides registered participants wi...
Article
Full-text available
An information retrieval (IR) system can often fail to retrieve relevant documents due to the incomplete specification of information need in the user’s query. Pseudo-relevance feedback (PRF) aims to improve IR effectiveness by exploiting potentially relevant aspects of the information need present in the documents retrieved in an initial search. S...
Article
Full-text available
Twitter (http://twitter.com) is one of the most popular social networking platforms. Twitter users can easily broadcast disaster-specific information, which, if effectively mined, can assist in relief operations. However, the brevity and informal nature of tweets pose a challenge to Information Retrieval (IR) researchers. In this paper, we successf...
Article
The Workshop on Evaluation of Personalisation in Information Retrieval (WEPIR 2018) was held in conjunction with the ACM SIGIR Conference on Human Information Interaction & Retrieval (CHIIR 2018) in New Brunswick, USA. The purpose of WEPIR 2018 was to bring together researchers from different backgrounds, interested in advancing the evaluation of p...
Article
Full-text available
Online Social Media, such as Twitter, Facebook and WhatsApp, are important sources of real-time information related to emergency events, including natural calamities, man-made disasters, epidemics, and so on. There has been a lot of recent work on designing information systems that would be useful for aiding post-disaster relief operations, as w...
Article
Full-text available
The purpose of the Strategic Workshop in Information Retrieval in Lorne is to explore the long-range issues of the Information Retrieval field, to recognize challenges that are on, or even over, the horizon, to build consensus on some of the key challenges, and to disseminate the resulting information to the research community. The intent is that thi...
Conference Paper
Users of current search systems actively interact with the system to complete their search task. This can encompass formulating and reformulating a series of queries expressing evolving or different information needs. We believe that the next generation of search systems will see a shift towards proactive understanding of user intent based on analysis...
Conference Paper
Conversational search presents opportunities to support users in their search activities to improve the effectiveness and efficiency of search while reducing their cognitive load. Limitations of the potential competency of conversational agents restrict the situations for which conversational search agents can replace human intermediaries. It is th...
Conference Paper
Full-text available
Expanding online archives of presentation recordings provide potentially valuable resources for learning and research. However, the huge volume of data that is becoming available means that users have difficulty locating material which will be of most value to them. Conventional summarisation methods making use of text-based features derived from t...
Conference Paper
Full-text available
User-generated content on online social media (OSM) platforms has become an important source of real-time information during emergency events. The SMERP workshop series aims to provide a forum for researchers working on utilizing OSM for emergency preparedness and aiding post-emergency relief operations. The workshop aims to bring together research...
Conference Paper
The purpose of the WEPIR 2018 workshop is to bring together researchers from different backgrounds, interested in advancing the evaluation of personalisation in information retrieval. The workshop focus is on the development of a common understanding of the challenges, requirements and practical limitations of meaningful evaluation of personalisati...
Article
Full-text available
Automatic detection of source code plagiarism is an important research field both within the commercial software industry and within the research community. Existing methods of plagiarism detection primarily involve exhaustive pairwise document comparison, which does not scale well for large software collections. To achieve scalability, we approach th...
Conference Paper
Full-text available
We present a novel method for the automatic generation of video summaries of academic presentations using linguistic and paralinguistic features. Our investigation is based on a corpus of academic conference presentations. Summaries are first generated based on keywords taken from transcripts created using automatic speech recognition (ASR). We aug...
Article
Full-text available
This is a report on the eighth edition of the Conference and Labs of the Evaluation Forum (CLEF 2017), held in early September 2017, in Dublin, Ireland. CLEF was a four-day event combining a Conference and an Evaluation Forum. The Conference featured keynotes by Leif Azzopardi and Vincent Wade, and presentation of 32 peer-reviewed research papers c...
Conference Paper
Full-text available
Video-to-video linking systems allow users to explore and exploit the content of a large-scale multimedia collection interactively and without the need to formulate specific queries. We present a short introduction to video-to-video linking (also called ‘video hyperlinking’), and describe the latest edition of the Video Hyperlinking (LNK) task at T...
Conference Paper
Full-text available
Rapidly expanding archives of audiovisual recordings available online are making unprecedented amounts of information available in many applications. New and efficient techniques to access this information are needed to fully realise the potential of these archives. We investigate the identification of areas of intentional or unintentional emphasis...
Conference Paper
The Personalised Information Retrieval Pilot Lab (PIR-CLEF 2017) provides a forum for the exploration of evaluation of personalised approaches to information retrieval (PIR). The Pilot Lab provides a preliminary edition of a Lab task dedicated to personalised search. The PIR-CLEF 2017 Pilot Task is the first evaluation benchmark based on the Cranfi...
Chapter
The high variability in content and structure, combined with transcription errors, makes effective information retrieval (IR) from archives of spoken user-generated content (UGC) very challenging. Previous research has shown that using passage-level evidence for query expansion (QE) in IR can be beneficial for improving search effectiveness. Our inve...
Article
Full-text available
The first international workshop on Exploitation of Social Media for Emergency Relief and Preparedness (SMERP) was held in conjunction with the 2017 European Conference on Information Retrieval (ECIR) in Aberdeen, Scotland, UK. The aim of the workshop was to explore various technologies for extracting useful information from social media content in...
Conference Paper
Full-text available
We present a novel method for the generation of automatic video summaries of academic presentations. We base our investigation on a corpus of multimodal academic conference presentations combining transcripts with paralinguistic multimodal features. We first generate summaries based on keywords by using transcripts created using automatic speech re...
Article
Full-text available
Cross Language Information Retrieval (CLIR) systems are a valuable tool to enable speakers of one language to search for content of interest expressed in a different language. A group for whom this is of particular interest is bilingual Arabic speakers who wish to search for English language content using information needs expressed in Arabic queri...
Conference Paper
We describe an initial study into the identification of important and useful information units within documents retrieved by an information retrieval system in response to a user query that expresses an underlying information need. This study is part of a larger investigation of the exploitation of useful and important units from retrieved d...
Article
The Benchmarking Initiative for Multimedia Evaluation (MediaEval) organizes an annual cycle of scientific evaluation tasks in the area of multimedia access and retrieval. The tasks offer scientific challenges to researchers working in diverse areas of multimedia technology. The tasks, which are focused on the social and human aspects of multimedia,...