Nicola Ferro

University of Padova | UNIPD · Department of Information Engineering

Full Professor in Computer Science

About

335 Publications
51,374 Reads
3,099 Citations
Introduction
Nicola Ferro currently works at the Department of Information Engineering, University of Padova.


Publications (335)
Chapter
Amyotrophic Lateral Sclerosis (ALS) is a severe chronic disease characterized by progressive or alternating impairment of neurological functions and by high heterogeneity in both symptoms and disease progression. As a consequence, its clinical course is highly uncertain, challenging both patients and clinicians. Indeed, patients have to man...
Article
Full-text available
Query performance prediction (QPP) has been studied extensively in the IR community over the last two decades. A by-product of this research is a methodology to evaluate the effectiveness of QPP techniques. In this paper, we re-examine the existing evaluation methodology commonly used for QPP, and propose a new approach. Our key idea is to model QP...
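As context for the abstract above, here is a minimal sketch of the traditional QPP evaluation methodology it revisits: correlating per-query predictor scores with the actual per-query effectiveness of a retrieval system. The numbers are purely illustrative, and this is not the new approach the paper proposes.

```python
# Minimal sketch of the traditional QPP evaluation methodology: correlate
# predicted query difficulty with the actual per-query effectiveness of a
# retrieval system. Data below is illustrative only.
from scipy.stats import kendalltau, pearsonr

# Hypothetical per-query predictor scores and average precision (AP) values.
predicted = [0.42, 0.10, 0.77, 0.55, 0.31]   # QPP scores, one per query
actual_ap = [0.38, 0.05, 0.81, 0.47, 0.29]   # AP of the system on the same queries

tau, _ = kendalltau(predicted, actual_ap)
r, _ = pearsonr(predicted, actual_ap)
print(f"Kendall's tau = {tau:.3f}, Pearson's r = {r:.3f}")
```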
Article
Full-text available
Feature selection is a common step in many ranking, classification, or prediction tasks and serves many purposes. By removing redundant or noisy features, the accuracy of ranking or classification can be improved and the computational cost of the subsequent learning steps can be reduced. However, feature selection can be itself a computationally ex...
Preprint
Full-text available
Feature selection is a common step in many ranking, classification, or prediction tasks and serves many purposes. By removing redundant or noisy features, the accuracy of ranking or classification can be improved and the computational cost of the subsequent learning steps can be reduced. However, feature selection can be itself a computationally ex...
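For readers unfamiliar with the step described in the article and preprint above, the sketch below shows a generic filter-style feature selection pass with scikit-learn on synthetic data; it illustrates the general idea only, not the specific technique studied in those works.

```python
# Illustrative filter-based feature selection before a ranking or
# classification step (not the specific method of the works above).
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, mutual_info_classif

X, y = make_classification(n_samples=500, n_features=40, n_informative=8,
                           random_state=0)          # synthetic data
selector = SelectKBest(mutual_info_classif, k=8)    # keep the 8 strongest features
X_reduced = selector.fit_transform(X, y)
print(X.shape, "->", X_reduced.shape)               # (500, 40) -> (500, 8)
```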
Chapter
The rapid growth in the number and complexity of conversational agents has highlighted the need for suitable evaluation tools to describe their performance. The main evaluation paradigms start from analyzing conversations in which the user explores an information need by following a scripted dialogue with the agent. We argue that this is not a realistic set...
Preprint
Full-text available
In this work we introduce repro_eval - a tool for reactive reproducibility studies of system-oriented information retrieval (IR) experiments. The corresponding Python package provides IR researchers with measures for different levels of reproduction when evaluating their systems' outputs. By offering an easily extensible interface, we hope to stimu...
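Rather than guessing the exact API of the repro_eval package, the sketch below computes by hand two measures of the kind it provides for comparing an original and a reproduced run: per-topic RMSE and Kendall's tau on the per-topic scores. Topic names and values are invented for illustration.

```python
# Hand-computed examples of reproduction measures comparing an original run
# with a reproduced run (illustrative; not the repro_eval package API).
import math
from scipy.stats import kendalltau

original   = {"t1": 0.42, "t2": 0.31, "t3": 0.58}   # per-topic AP of the original run
reproduced = {"t1": 0.40, "t2": 0.35, "t3": 0.55}   # per-topic AP of the reproduced run

topics = sorted(original)
rmse = math.sqrt(sum((original[t] - reproduced[t]) ** 2 for t in topics) / len(topics))
tau, _ = kendalltau([original[t] for t in topics], [reproduced[t] for t in topics])
print(f"RMSE = {rmse:.4f}, Kendall's tau = {tau:.3f}")
```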
Article
This is a report on the twelfth edition of the Conference and Labs of the Evaluation Forum (CLEF 2021), (virtually) held on September 21–24, 2021, in Bucharest, Romania. CLEF was a four day event combining a Conference and an Evaluation Forum. The Conference featured keynotes by Naila Murray and Mark Sanderson, and presentation of peer reviewed r...
Article
Full-text available
Information Retrieval (IR) is a discipline deeply rooted in evaluation since its inception. Indeed, experimentally measuring and statistically validating the performance of IR systems are the only possible ways to compare systems and understand which are better than others and, ultimately, more effective and useful for end-users. Since the seminal...
Article
Several recent studies have explored the interaction effects between topics, systems, corpora, and components when measuring retrieval effectiveness. However, all of these previous studies assume that a topic or information need is represented by a single query. In reality, users routinely reformulate queries to satisfy an information need. In rece...
Chapter
The ultimate goal of evaluation is to understand when two IR systems are (significantly) different. To this end, many comparison procedures have been developed over time. However, to date, most reproducibility efforts focused just on reproducing systems and algorithms, almost fully neglecting to investigate the reproducibility of the methods we...
Chapter
In this work we introduce repro_eval - a tool for reactive reproducibility studies of system-oriented Information Retrieval (IR) experiments. The corresponding Python package provides IR researchers with measures for different levels of reproduction when evaluating their systems’ outputs. By offering an easily extensible interface, we hope to stimu...
Conference Paper
Full-text available
We present FullBrain, a social e-learning platform where students share and track their knowledge. FullBrain users can post notes, ask questions and share learning resources in dedicated course and concept spaces. We detail two components of FullBrain: a Social Information Retrieval (SIR) system equipped with query autocomplete and query autosugge...
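As a minimal, generic sketch of the query autocomplete feature mentioned above (not FullBrain's implementation), a prefix lookup over a hypothetical query log could look like this:

```python
# Minimal prefix-based query autocomplete over a sorted query log,
# purely illustrative of the feature mentioned above.
from bisect import bisect_left

query_log = sorted(["information retrieval", "information extraction",
                    "neural networks", "neural ranking", "markov chains"])

def autocomplete(prefix, k=5):
    """Return up to k logged queries starting with the given prefix."""
    i = bisect_left(query_log, prefix)
    out = []
    while i < len(query_log) and query_log[i].startswith(prefix) and len(out) < k:
        out.append(query_log[i])
        i += 1
    return out

print(autocomplete("inf"))   # ['information extraction', 'information retrieval']
```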
Conference Paper
Full-text available
The ultimate goal of evaluation is to understand when two IR systems are (significantly) different. To this end, many comparison procedures have been developed over time. However, to date, most reproducibility efforts focused just on reproducing systems and algorithms, almost fully neglecting to investigate the reproducibility of the methods w...
Conference Paper
Full-text available
This is an overview of the NTCIR-15 We Want Web with CENTRE (WWW-3) task. The task features the Chinese subtask (adhoc web search) and the English subtask (adhoc web search, replicability and reproducibility), and received 48 runs from 9 teams. We describe the subtasks, data, evaluation measures, and the official evaluation results.
Conference Paper
Full-text available
In this work we introduce repro_eval - a tool for reactive reproducibility studies of system-oriented Information Retrieval (IR) experiments. The corresponding Python package provides IR researchers with measures for different levels of reproduction when evaluating their systems' outputs. By offering an easily extensible interface, we hope to stimula...
Conference Paper
Full-text available
Query Performance Prediction (QPP) has been studied extensively in the IR community over the last two decades. A by-product of this research is a methodology to evaluate the effectiveness of QPP techniques. In this paper, we reexamine the existing evaluation methodology commonly used for QPP, and propose a new approach. Our key idea is to model QPP...
Preprint
Full-text available
Recently, it was shown that most popular IR measures are not interval-scaled, implying that decades of experimental IR research used potentially improper methods, which may have produced questionable results. However, it was unclear if and to what extent these findings apply to actual evaluations and this opened a debate in the community with resea...
Book
This book constitutes the refereed proceedings of the 12th International Conference of the CLEF Association, CLEF 2021, held virtually in September 2021. The conference has a clear focus on experimental information retrieval with special attention to the challenges of multimodality, multilinguality, and interactive search ranging from unstructured...
Article
Full-text available
Learning to Rank (LtR) techniques leverage assessed samples of query-document relevance to learn effective ranking functions able to exploit the noisy signals hidden in the features used to represent queries and documents. In this paper we explore how to enhance the state-of-the-art LambdaMart LtR algorithm by integrating in the training process an...
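As a baseline reference point for the abstract above, the sketch below trains a plain LambdaMART ranker with LightGBM's lambdarank objective on synthetic data; the paper's actual contribution (integrating explicit knowledge into the training process) is not reproduced here.

```python
# Baseline LambdaMART training with LightGBM's lambdarank objective on
# synthetic query-document data (illustrative only).
import numpy as np
from lightgbm import LGBMRanker

rng = np.random.default_rng(0)
X = rng.random((300, 10))                        # 300 query-document feature vectors
y = rng.integers(0, 3, size=300)                 # graded relevance labels (0-2)
group = [30] * 10                                # 10 queries with 30 documents each

ranker = LGBMRanker(objective="lambdarank", n_estimators=50)
ranker.fit(X, y, group=group)
scores = ranker.predict(X[:30])                  # scores for the first query's documents
```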
Conference Paper
Full-text available
Evaluation of the quality of data integration processes is usually performed via onerous manual data inspections. This task is particularly heavy in real business scenarios, where the large amount of data makes checking all the tuples infeasible and the frequent updates, i.e. changes in the sources and/or new sources, require repeating the evaluatio...
Article
Full-text available
This is a report on the eleventh edition of the Conference and Labs of the Evaluation Forum (CLEF 2020), (virtually) held from September 22–25, 2020, in Thessaloniki, Greece. CLEF was a four day event combining a Conference and an Evaluation Forum. The Conference featured keynotes by Ellen Voorhees and Yiannis Kompatsiaris, and presentation...
Preprint
Full-text available
Replicability and reproducibility of experimental results are primary concerns in all the areas of science and IR is not an exception. Besides the problem of moving the field towards more reproducible experimental practices and protocols, we also face a severe methodological issue: we do not have any means to assess when reproduced is reproduced. M...
Chapter
Ground-truth creation is one of the most demanding activities in terms of time, effort, and resources needed for creating an experimental collection. For this reason, crowdsourcing has emerged as a viable option to reduce the costs and time invested in it. An effective assessor merging methodology is crucial to guarantee a good ground-truth quality...
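To make the merging problem concrete, here is the simplest conceivable assessor-merging strategy, a per-document majority vote over crowdsourced labels; the chapter above studies more refined merging methodologies.

```python
# Simplest possible assessor-merging strategy for crowdsourced judgments:
# majority vote per document (illustrative only).
from collections import Counter

judgments = {                     # document -> labels from individual workers
    "d1": [1, 1, 0],
    "d2": [0, 0, 1],
    "d3": [1, 1, 1],
}
ground_truth = {doc: Counter(labels).most_common(1)[0][0]
                for doc, labels in judgments.items()}
print(ground_truth)               # {'d1': 1, 'd2': 0, 'd3': 1}
```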
Book
This book constitutes the refereed proceedings of the 11th International Conference of the CLEF Association, CLEF 2020, held in Thessaloniki, Greece, in September 2020. The conference has a clear focus on experimental information retrieval with special attention to the challenges of multimodality, multilinguality, and interactive search ranging fr...
Conference Paper
Full-text available
Evaluation measures are more or less explicitly based on user models which abstract how users interact with a ranked result list and how they accumulate utility from it. However, traditional measures typically come with a hard-coded user model which can be, at best, parametrized. Moreover, they take a deterministic approach which leads to assign a...
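To make the idea of a non-deterministic user model concrete, the sketch below draws the persistence parameter of rank-biased precision (RBP) from a distribution, so the measure yields a distribution of values rather than a single number; this is only an illustration, not the framework proposed in the paper above.

```python
# From a deterministic, hard-coded user model to a stochastic one: RBP with the
# persistence parameter p drawn from a Beta distribution, yielding a
# distribution of measure values (illustrative only).
import numpy as np

def rbp(rels, p):
    """Rank-biased precision for a binary relevance vector and persistence p."""
    return (1 - p) * sum(r * p ** i for i, r in enumerate(rels))

rels = [1, 0, 1, 1, 0]                                        # relevance of a ranked list
rng = np.random.default_rng(0)
samples = [rbp(rels, p) for p in rng.beta(8, 2, size=1000)]   # p ~ Beta(8, 2)
print(f"E[RBP] = {np.mean(samples):.3f} +/- {np.std(samples):.3f}")
```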
Conference Paper
Full-text available
Replicability and reproducibility of experimental results are primary concerns in all the areas of science and IR is not an exception. Besides the problem of moving the field towards more reproducible experimental practices and protocols, we also face a severe methodological issue: we do not have any means to assess when reproduced is reproduced....
Conference Paper
Full-text available
Ground-truth creation is one of the most demanding activities in terms of time, effort, and resources needed for creating an experimental collection. For this reason, crowdsourcing has emerged as a viable option to reduce the costs and time invested in it. An effective assessor merging methodology is crucial to guarantee a good ground-truth quality...
Conference Paper
Full-text available
Learning to Rank (LtR) techniques leverage assessed samples of query-document relevance to learn ranking functions able to exploit the noisy signals hidden in the features used to represent queries and documents. In this paper, we explore how to enhance the state-of-the-art LambdaMart algorithm by integrating in the training process an explicit kno...
Conference Paper
Full-text available
The CLEF-NTCIR-TREC Reproducibility track (CENTRE) is a research replication and reproduction effort spanning three major information retrieval evaluation venues. In the TREC edition, CENTRE participants were asked to reproduce runs from either the TREC 2016 clinical decision support track, the 2013 web track, or the 2014 web track. Only one group...
Article
Full-text available
Evaluation measures are the basis for quantifying the performance of IR systems and the way in which their values can be processed to perform statistical analyses depends on the scales on which these measures are defined. For example, mean and variance should be computed only when relying on interval scales. In our previous work we defined a theory...
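A toy illustration of why the scale matters: a strictly monotone rescaling of the scores, which is harmless if they are treated as merely ordinal, can flip a comparison based on means, which presupposes an interval scale. The numbers below are invented.

```python
# A strictly monotone (order-preserving) rescaling can flip a mean-based
# comparison of two systems, showing why means require an interval scale.
import math

sys_a = [0.0, 0.9]            # per-topic scores of system A
sys_b = [0.4, 0.4]            # per-topic scores of system B

mean = lambda xs: sum(xs) / len(xs)
print(mean(sys_a) > mean(sys_b))                    # True  (0.45 vs 0.40)

rescaled_a = [math.sqrt(x) for x in sys_a]          # order-preserving map
rescaled_b = [math.sqrt(x) for x in sys_b]
print(mean(rescaled_a) > mean(rescaled_b))          # False (0.47 vs 0.63)
```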
Chapter
In this work we describe how Docker images can be used to enhance the reproducibility of Neural IR models. We report our results reproducing the Neural Vector Space Model (NVSM) and we release a CPU-based and a GPU-based Docker image. Finally, we present some insights about reproducing Neural IR models.
Book
This two-volume set LNCS 12035 and 12036 constitutes the refereed proceedings of the 42nd European Conference on IR Research, ECIR 2020, held in Lisbon, Portugal, in April 2020. The 55 full papers presented together with 8 reproducibility papers, 46 short papers, 10 demonstration papers, 12 invited CLEF papers, 7 doctoral consortium papers, 4 works...
Book
This two-volume set LNCS 12035 and 12036 constitutes the refereed proceedings of the 42nd European Conference on IR Research, ECIR 2020, held in Lisbon, Portugal, in April 2020. The 55 full papers presented together with 8 reproducibility papers, 46 short papers, 10 demonstration papers, 12 invited CLEF papers, 7 doctoral consortium papers, 4 works...
Article
Full-text available
This is a report on the tenth edition of the Conference and Labs of the Evaluation Forum (CLEF 2019), held from September 9–12, 2019, in Lugano, Switzerland. CLEF was a four day event combining a Conference and an Evaluation Forum. The Conference featured keynotes by Bruce Croft, Yair Neuman, and Miguel Martinez, and presentation of peer reviewe...
Cover Page
Full-text available
This paper reports on the 12th edition of the European Summer School in Information Retrieval (ESSIR 2019), held in Milan, Italy, from 15 to 19 July 2019.
Chapter
We investigate the application of Visual Analytics (VA) techniques to the exploration and interpretation of Information Retrieval (IR) experimental data. We first briefly introduce the main concepts about VA and then we present some relevant examples of VA prototypes developed for better investigating IR evaluation data. Finally, we conclude with a...
Chapter
This paper describes the steps that led to the invention, design and development of the Distributed Information Retrieval Evaluation Campaign Tool (DIRECT) system for managing and accessing the data used and produced within experimental evaluation in Information Retrieval (IR). We present the context in which DIRECT was conceived, its conceptual mo...
Chapter
This introductory chapter begins by explaining briefly what is intended by experimental evaluation in information retrieval, in order to provide the necessary background for the rest of this volume. The major international evaluation initiatives that have adopted and implemented this common framework in various ways are then presented and their rela...
Conference Paper
Full-text available
Reproducibility has become increasingly important for many research areas; IR is no exception and has started to be concerned with reproducibility and its impact on research results. This paper describes our second attempt to propose a lab on reproducibility named CENTRE, held during CLEF 2019. The aim of CENTRE is to run both a rep...
Chapter
Full-text available
Reproducibility has become increasingly important for many research areas; IR is no exception and has started to be concerned with reproducibility and its impact on research results. This paper describes our second attempt to propose a lab on reproducibility named CENTRE, held during CLEF 2019. The aim of CENTRE is to run both a rep...
Chapter
2019 marks the 20th birthday for CLEF, an evaluation campaign activity which has applied the Cranfield evaluation paradigm to the testing of multilingual and multimodal information access systems in Europe. This paper provides a summary of the motivations which led to the establishment of CLEF, and a description of how it has evolved o...
Conference Paper
Full-text available
In this work, we propose a Docker image architecture for the replicability of Neural IR (NeuIR) models. We also share two self-contained Docker images to run the Neural Vector Space Model (NVSM) [22], an unsupervised NeuIR model. The first image we share (nvsm_cpu) can run on most machines and relies only on CPU to perform the required computation...
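Assuming the CPU image is available locally under the name given in the abstract (nvsm_cpu), a run could be triggered from Python via the Docker SDK roughly as below; the real image may expect a specific command, mounted data volumes, or environment variables.

```python
# Sketch of running the CPU-based image named in the abstract via the Docker
# SDK for Python; the actual entrypoint and required mounts may differ.
import docker

client = docker.from_env()
logs = client.containers.run("nvsm_cpu", remove=True)   # image name from the abstract
print(logs.decode(errors="ignore"))
```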
Conference Paper
Full-text available
The Open-Source IR Replicability Challenge (OSIRRC 2019), organized as a workshop at SIGIR 2019, aims to improve the replicability of ad hoc retrieval experiments in information retrieval by gathering a community of researchers to jointly develop a common Docker specification and build Docker images that encapsulate a diversity of systems and retri...
Conference Paper
Full-text available
We improve the measurement accuracy of retrieval system performance by better modeling the noise present in test collection scores. Our technique draws its inspiration from two approaches: one, which exploits the variable measurement accuracy of topics; the other, which randomly splits document collections into shards. We describe and theoretically...
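One of the two ingredients mentioned above, randomly splitting a document collection into shards, looks in its most basic form like the sketch below; the paper combines it with modeling the variable measurement accuracy of topics, which is not shown here.

```python
# Randomly partition a document collection into k roughly equal-sized shards
# (illustrative only).
import random

def random_shards(doc_ids, k, seed=0):
    """Partition doc_ids into k random, roughly equal-sized shards."""
    ids = list(doc_ids)
    random.Random(seed).shuffle(ids)
    return [ids[i::k] for i in range(k)]

shards = random_shards([f"d{i}" for i in range(10)], k=3)
print([len(s) for s in shards])   # [4, 3, 3]
```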
Conference Paper
Full-text available
CENTRE is the first-ever metatask that operates across the three major information retrieval evaluation venues: CLEF, NTCIR, and TREC. The task had three subtasks: T1 (Replicability), T2TREC (Reproducibility), and T2OPEN (Reproducibility). The T1 subtask examined whether a particular pair of runs from the NTCIR-13 WWW-1 task can be replicated (on...
Conference Paper
Full-text available
2019 marks the 20th birthday for CLEF, an evaluation campaign activity which has applied the Cranfield evaluation paradigm to the testing of multilingual and multimodal information access systems in Europe. This paper provides a summary of the motivations which led to the establishment of CLEF, and a description of how it has evolved over the years...
Conference Paper
Full-text available
Reproducibility has become increasingly important for many research areas; IR is no exception and has started to be concerned with reproducibility and its impact on research results. This paper describes our second attempt to propose a lab on reproducibility named CENTRE, held during CLEF 2019. The aim of CENTRE is to run both a rep...
Conference Paper
Full-text available
The importance of repeatability, replicability, and reproducibility is broadly recognized in the computational sciences, both in supporting desirable scientific methodology as well as sustaining empirical progress. This workshop tackles the replicability challenge for ad hoc document retrieval, via a common Docker interface specification to support...
Conference Paper
Full-text available
Evaluation measures are the basis for quantifying the performance of information access systems and the way in which their values can be processed to perform statistical analyses depends on the scales on which these measures are defined. For example, mean and variance should be computed only when relying on interval scales. We define a formal theor...
Conference Paper
Full-text available
We improve the measurement accuracy of retrieval system performance by better modeling the noise present in test collection scores. Our technique draws its inspiration from two approaches: one, which exploits the variable measurement accuracy of topics; the other, which randomly splits document collections into shards. We describe and theoretically...
Chapter
Full-text available
It has been recently proposed to consider relevance assessment as a stochastic process where relevance judgements are modeled as binomial random variables and, consequently, evaluation measures become random evaluation measures, removing the distinction between binary and multi-graded evaluation measures.
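A small Monte Carlo sketch of the stochastic view described above: each relevance judgement is a Bernoulli draw, so a measure such as precision@5 becomes a random variable whose distribution can be estimated by sampling. The probabilities below are invented for illustration.

```python
# Relevance judgements as Bernoulli random variables turn precision@5 into a
# random evaluation measure; its distribution is estimated by sampling.
import numpy as np

rng = np.random.default_rng(0)
p_rel = [0.9, 0.7, 0.2, 0.6, 0.1]          # per-document probability of relevance

samples = rng.binomial(1, p_rel, size=(10000, len(p_rel)))   # sampled judgements
prec_at_5 = samples.mean(axis=1)                             # one P@5 value per sample
print(f"E[P@5] = {prec_at_5.mean():.3f}, sd = {prec_at_5.std():.3f}")
```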
Chapter
Full-text available
Reproducibility of experimental results has recently become a primary issue in the scientific community at large, and in the information retrieval community as well, where initiatives and incentives to promote and ease reproducibility are arising. In this context, CENTRE is a joint CLEF/TREC/NTCIR lab which aims at raising the attention on this top...
Chapter
Full-text available
We investigate a new approach for evaluating session-based information retrieval systems, based on Markov chains. In particular, we develop a new family of evaluation measures, inspired by random walks, which account for the probability of moving to the next and previous documents in a result list, to the next query in a session, and to the end of...
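In the spirit of the measures described above, though not their exact definition, the toy simulation below runs a random walk over a two-query session in which the simulated user moves to the next document, jumps to the next query, or stops, accumulating the gain of the relevant documents visited; transition probabilities and relevance labels are invented.

```python
# Toy random walk over a two-query session accumulating gain from visited
# relevant documents (simplified illustration; backward moves are omitted).
import random

session = [[1, 0, 1], [0, 1, 0]]        # relevance of ranked lists for query 1 and 2
P_NEXT_DOC, P_NEXT_QUERY = 0.7, 0.2     # remaining 0.1 = stop

def walk(session, seed):
    rng, gain, q, r = random.Random(seed), 0, 0, 0
    while q < len(session):
        gain += session[q][r]           # collect the gain of the visited document
        u = rng.random()
        if u < P_NEXT_DOC and r + 1 < len(session[q]):
            r += 1                      # move to the next document
        elif u < P_NEXT_DOC + P_NEXT_QUERY:
            q, r = q + 1, 0             # jump to the next query in the session
        else:
            break                       # stop the session
    return gain

print(sum(walk(session, s) for s in range(10000)) / 10000)   # expected session gain
```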
Article
Full-text available
Despite the bulk of research studying how to more accurately compare the performance of IR systems, less attention is devoted to better understanding the different factors which play a role in such performance and how they interact. This is the case of shards, i.e. partitioning a document collection into sub-parts, which are used for many different...
Conference Paper
Full-text available
It has been recently proposed to consider relevance assessment as a stochastic process where relevance judgements are modeled as binomial random variables and, consequently, evaluation measures become random evaluation measures, removing the distinction between binary and multi-graded evaluation measures. In this paper, we adopt this stochastic vie...
Conference Paper
Full-text available
We investigate a new approach for evaluating session-based information retrieval systems, based on Markov chains. In particular, we develop a new family of evaluation measures, inspired by random walks, which account for the probability of moving to the next and previous documents in a result list, to the next query in a session, and to the end of...
Conference Paper
Full-text available
Reproducibility of experimental results has recently become a primary issue in the scientific community at large, and in the information retrieval community as well, where initiatives and incentives to promote and ease reproducibility are arising. In this context, CENTRE is a joint CLEF/TREC/NTCIR lab which aims at raising the attention on this top...
Article
Full-text available
This is a report on the first edition of the International Workshop on Generalization in Information Retrieval (GLARE 2018), co-located with the 27th ACM International Conference on Information and Knowledge Management (CIKM 2018) held in Turin, Italy, on October 22, 2018.
Book
This volume celebrates the twentieth anniversary of CLEF – the Cross-Language Evaluation Forum for its first ten years, and the Conference and Labs of the Evaluation Forum since then – and traces its evolution over these first two decades. CLEF’s main mission is to promote research, innovation and development of information retrieval (IR) systems by ant...
Book
This book constitutes the refereed proceedings of the 10th International Conference of the CLEF Association, CLEF 2019, held in Lugano, Switzerland, in September 2019. The conference has a clear focus on experimental information retrieval with special attention to the challenges of multimodality, multilinguality, and interactive search ranging from...
Article
Full-text available
This is a report on the ninth edition of the Conference and Labs of the Evaluation Forum (CLEF 2018), held in early September 2018, in Avignon, France. CLEF was a four day event combining a Conference and an Evaluation Forum. The Conference featured keynotes by Nicholas Belkin, Julio Gonzalo, and Gabriella Pasi, and presentation of 29 pee...
Article
Full-text available
We describe the state-of-the-art in performance modeling and prediction for Information Retrieval (IR), Natural Language Processing (NLP) and Recommender Systems (RecSys) along with its shortcomings and strengths. We present a framework for further research, identifying five major problem areas: understanding measures, performance analysis, making...