Hamed Zamani

University of Massachusetts Amherst · School of Computer Science

110
Publications
11,395
Reads
2,474
Citations

Publications (110)
Preprint
This paper studies multi-task training of retrieval-augmented generation models for knowledge-intensive tasks. We propose to clean the training set by utilizing a distinct property of knowledge-intensive generation: The connection of query-answer pairs to items in the knowledge base. We filter training examples via a threshold of confidence on the...
Preprint
Recently, several dense retrieval (DR) models have demonstrated performance competitive with the term-based retrieval models that are ubiquitous in search systems. In contrast to term-based matching, DR projects queries and documents into a dense vector space and retrieves results via (approximate) nearest neighbor search. Deploying a new system, such as DR, in...
Preprint
Full-text available
Asking clarification questions is an active area of research; however, resources for training and evaluating search clarification methods are not sufficient. To address this issue, we describe MIMICS-Duo, a new freely available dataset of 306 search queries with multiple clarifications (a total of 1,034 query-clarification pairs). MIMICS-Duo contai...
Preprint
Full-text available
Although information access systems have long supported people in accomplishing a wide range of tasks, we propose broadening the scope of users of information access systems to include task-driven machines, such as machine learning models. In this way, the core principles of indexing, representation, retrieval, and ranking can be applied and extend...
Preprint
Full-text available
Recent work has shown that more effective dense retrieval models can be obtained by distilling ranking knowledge from an existing base re-ranking model. In this paper, we propose a generic curriculum learning based optimization framework called CL-DRD that controls the difficulty level of training data produced by the re-ranking (teacher) model. CL...
Preprint
Conversational information seeking (CIS) is concerned with a sequence of interactions between one or more users and an information system. Interactions in CIS are primarily based on natural language dialogue, while they may include other types of interactions, such as click, touch, and body gestures. This monograph provides a thorough overview of C...
Preprint
Full-text available
At the foundation of scientific evaluation is the labor-intensive process of peer review. This critical task requires participants to consume and interpret vast amounts of highly technical text. We show that discourse cues from rebuttals can shed light on the quality and interpretation of reviews. Further, an understanding of the argumentative stra...
Preprint
Full-text available
Information seeking conversations between users and Conversational Search Agents (CSAs) consist of multiple turns of interaction. While users initiate a search session, ideally a CSA should sometimes take the lead in the conversation by obtaining feedback from the user by offering query suggestions or asking for query clarifications, i.e., mixed init...
Article
Full-text available
This research analyzes human‐generated clarification questions to provide insights into how they are used to disambiguate and provide a better understanding of information needs. A set of clarification questions is extracted from posts on the Stack Exchange platform. A novel taxonomy is defined for the annotation of the questions and their responses....
Article
Users install many apps on their smartphones, raising issues related to information overload for users and resource management for devices. Moreover, the recent increase in the use of personal assistants has made mobile devices even more pervasive in users’ lives. This article addresses two research problems that are vital for developing effective...
Conference Paper
Full-text available
Recent research on conversational information seeking mostly focuses on uni-modal interactions and information items. In this perspective paper, we highlight the importance of moving towards developing and evaluating multi-modal conversational information seeking (MMCIS) systems as they enable us to leverage richer context, overcome errors, and inc...
Article
Full-text available
One common characteristic of research works focused on fairness evaluation (in machine learning) is that they call for some form of parity (equality) either in treatment – meaning they ignore the information about users’ memberships in protected classes during training – or in impact – by enforcing proportional beneficial outcomes to users in diffe...
Preprint
Full-text available
Podcasts are spoken documents across a wide range of genres and styles, with growing listenership across the world, and a rapidly lowering barrier to entry for both listeners and creators. The great strides in search and recommendation in research and industry have yet to see impact in the podcast space, where recommendations are still largely driv...
Preprint
An emerging recipe for achieving state-of-the-art effectiveness in neural document re-ranking involves utilizing large pre-trained language models (e.g., BERT) to evaluate all individual passages in the document and then aggregating the outputs by pooling or additional Transformer layers. A major drawback of this approach is high query latency du...
Preprint
In this work, we address multi-modal information needs that contain text questions and images by focusing on passage retrieval for outside-knowledge visual question answering. This task requires access to outside knowledge, which in our case we define to be a large unstructured passage collection. We first conduct sparse retrieval with BM25 and stu...
Preprint
The Transformer-Kernel (TK) model has demonstrated strong reranking performance on the TREC Deep Learning benchmark and can be considered to be an efficient (but slightly less effective) alternative to other Transformer-based architectures that employ (i) large-scale pretraining (high training cost), (ii) joint encoding of query and document (hi...
Preprint
While current information retrieval systems are effective for known-item retrieval where the searcher provides a precise name or identifier for the item being sought, systems tend to be much less effective for cases where the searcher is unable to express a precise name or identifier. We refer to this as tip of the tongue (TOT) known-item retrieval...
Preprint
Full-text available
Users install many apps on their smartphones, raising issues related to information overload for users and resource management for devices. Moreover, the recent increase in the use of personal assistants has made mobile devices even more pervasive in users' lives. This paper addresses two research problems that are vital for developing effective pe...
Preprint
We benchmark Conformer-Kernel models under the strict blind evaluation setting of the TREC 2020 Deep Learning track. In particular, we study the impact of incorporating: (i) Explicit term matching to complement matching based on learned representations (i.e., the "Duet principle"), (ii) query term independence (i.e., the "QTI assumption") to scale...
Preprint
The Transformer-Kernel (TK) model has demonstrated strong reranking performance on the TREC Deep Learning benchmark and can be considered to be an efficient (but slightly less effective) alternative to BERT-based ranking models. In this work, we extend the TK architecture to the full retrieval setting by incorporating the query term independence...
Preprint
Search clarification has recently attracted much attention due to its applications in search engines. It has also been recognized as a major component in conversational information seeking systems. Despite its importance, the research community still lacks large-scale data for studying different aspects of search clarification. In thi...
Preprint
Asking clarifying questions in response to ambiguous or faceted queries has been recognized as a useful technique for various information retrieval systems, especially conversational search systems with limited bandwidth interfaces. Analyzing and generating clarifying questions have been studied recently but the accurate utilization of user respons...
Preprint
Asking clarifying questions in response to search queries has been recognized as a useful technique for revealing the underlying intent of the query. Clarification has applications in retrieval systems with different interfaces, from the traditional web search interfaces to the limited bandwidth interfaces as in speech-only and small screen devices...
Preprint
Neural networks, particularly Transformer-based architectures, have achieved significant performance improvements on several retrieval benchmarks. When the items being retrieved are documents, the time and memory cost of employing Transformers over a full sequence of document terms can be prohibitive. A popular strategy involves considering only th...
Article
The rapid growth in speech and small screen interfaces, particularly on mobile devices, has significantly influenced the way users interact with intelligent systems to satisfy their information needs. The growing interest in personal digital assistants, such as Amazon Alexa, Apple Siri, Google Assistant, and Microsoft Cortana, demonstrates the will...
Chapter
Full-text available
Considering the widespread use of mobile and voice search, answer passage retrieval for non-factoid questions plays a critical role in modern information retrieval systems. Despite the importance of the task, the community still feels the significant lack of large-scale non-factoid question answering collections with real questions and comprehensiv...
Preprint
Full-text available
This paper discusses the potential for creating academic resources (tools, data, and evaluation approaches) to support research in conversational search, by focusing on realistic information needs and conversational interactions. Specifically, we propose to develop and operate a prototype conversational search system for scholarly activities. This...
Preprint
Conversational information seeking (CIS) has been recognized as a major emerging research area in information retrieval. Such research will require data and tools, to allow the implementation and study of conversational systems. This paper introduces Macaw, an open-source framework with a modular architecture for CIS research. Macaw supports multi-...
Conference Paper
Estimating the quality of a result list, often referred to as query performance prediction (QPP), is a challenging and important task in information retrieval. It can be used as feedback to users, search engines, and system administrators. Although predicting the performance of retrieval models has been extensively studied for the ad-hoc retrieval...
Preprint
Full-text available
Fairness in recommender systems has been considered with respect to sensitive attributes of users (e.g., gender, race) or items (e.g., revenue in a multistakeholder setting). Regardless, the concept has been commonly interpreted as some form of equality, i.e., the degree to which the system is meeting the information needs of all its users in an eq...
Conference Paper
Full-text available
Fairness in recommender systems has been considered with respect to sensitive attributes of users (e.g., gender, race) or items (e.g., revenue in a multistakeholder setting). Regardless, the concept has been commonly interpreted as some form of equality, i.e., the degree to which the system is meeting the information needs of all its users in an equ...
Article
The ACM Recommender Systems Challenge 2018 focused on the task of automatic music playlist continuation, which is a form of the more general task of sequential recommendation. Given a playlist of arbitrary length with some additional meta-data, the task was to recommend up to 500 tracks that fit the target characteristics of the original playlist....
Preprint
Full-text available
Multi-hop question answering (QA) requires an information retrieval (IR) system that can find multiple supporting evidence needed to answer the question, making the retrieval process very challenging. This paper introduces an IR technique that uses information of entities present in the initially retrieved evidence to learn to 'hop' t...
Preprint
Full-text available
Fairness in recommender systems has been considered with respect to sensitive attributes of users (e.g., gender, race) or items (e.g., revenue in a multistakeholder setting). Regardless, the concept has been commonly interpreted as some form of equality, i.e., the degree to which the system is meeting the information needs of all its users in an...
Conference Paper
Full-text available
Users often fail to formulate their complex information needs in a single query. As a consequence, they may need to scan multiple result pages or reformulate their queries, which may be a frustrating experience. Alternatively, systems can improve user satisfaction by proactively asking questions of the users to clarify their information needs. Aski...
Conference Paper
With the widespread use of mobile devices, instant messaging (IM) services have recently attracted a great deal of attention from millions of users. This has motivated news agencies to share their content via such platforms in addition to their websites and popular social media. As a result, thousands of users nowadays follow the news agencies through t...
Preprint
Full-text available
Users often fail to formulate their complex information needs in a single query. As a consequence, they may need to scan multiple result pages or reformulate their queries, which may be a frustrating experience. Alternatively, systems can improve user satisfaction by proactively asking questions of the users to clarify their information needs. Aski...
Article
Ranking models lie at the heart of research on information retrieval (IR). During the past decades, different techniques have been proposed for constructing ranking models, from traditional heuristic methods, probabilistic methods, to modern machine learning methods. Recently, with the advance of deep learning technology, we have witnessed a growin...
Preprint
Full-text available
Considering the widespread use of mobile and voice search, answer passage retrieval for non-factoid questions plays a critical role in modern information retrieval systems. Despite the importance of the task, the community still feels the significant lack of large-scale non-factoid question answering collections with real questions and comprehensiv...
Preprint
The bidirectional encoder representations from transformers (BERT) model has recently advanced the state-of-the-art in passage re-ranking. In this paper, we analyze the results produced by a fine-tuned BERT model to better understand the reasons behind such substantial improvements. To this aim, we focus on the MS MARCO passage re-ranking dataset a...
Preprint
Ranking models lie at the heart of research on information retrieval (IR). During the past decades, different techniques have been proposed for constructing ranking models, from traditional heuristic methods, probabilistic methods, to modern machine learning methods. Recently, with the advance of deep learning technology, we have witnessed a growin...
Preprint
Full-text available
With the recent growth in the use of conversational systems and intelligent assistants such as Google Assistant and Microsoft Cortana, mobile devices are becoming even more pervasive in our lives. As a consequence, users are getting engaged with mobile apps and frequently search for an information need using different apps. Recent work has stated t...
Conference Paper
The availability of massive data and computing power allowing for effective data driven neural approaches is having a major impact on machine learning and information retrieval research, but these models have a basic problem with efficiency. Current neural ranking models are implemented as multistage rankers: for efficiency reasons, the neural mode...
Data
In this data collection, we are particularly interested in providing the first dataset focusing on a unified search framework for mobile devices by collecting cross-app mobile queries as well as their target apps. To this end, we recruited 255 participants through an open online call asking them to install uSearch on their smartphones and let it ru...
Data
In this data collection, we are particularly interested in providing the first dataset focusing on a unified search framework for mobile devices by collecting cross-app mobile queries as well as their target apps. To this end, we initially asked crowdworkers to explain their latest search experience on their smartphones and used them to define vari...
Preprint
Full-text available
The ACM Recommender Systems Challenge 2018 focused on the task of automatic music playlist continuation, which is a form of the more general task of sequential recommendation. Given a playlist of arbitrary length with some additional meta-data, the task was to recommend up to 500 tracks that fit the target characteristics of the original playlist....
Article
Full-text available
The rapid growth of the Web has increased the difficulty of finding the information that can address the users’ information needs. A number of recommendation approaches have been developed to tackle this problem. The increase in the number of data providers has necessitated the development of multi-publisher recommender systems; systems that includ...
Conference Paper
The ACM Recommender Systems Challenge 2018 focused on automatic music playlist continuation, which is a form of the more general task of sequential recommendation. Given a playlist of arbitrary length, the challenge was to recommend up to 500 tracks that fit the target characteristics of the original playlist. For the Challenge, Spotify released a...
Conference Paper
Neural network approaches have recently been shown to be effective in several information retrieval (IR) tasks. However, neural approaches often require large volumes of training data to perform effectively, which are not always available. To mitigate the shortage of labeled data, training neural IR models with weak supervision has been recently proposed...
Preprint
Despite the somewhat different techniques used in developing search engines and recommender systems, they both follow the same goal: helping people to get the information they need at the right time. Due to this common goal, search and recommendation models can potentially benefit from each other. The recent advances in neural network technologies...
Conference Paper
Full-text available
Does this sentence need citation? In this paper, we introduce the task of citation worthiness for scientific texts at a sentence-level granularity. The task is to detect whether a sentence in a scientific article needs to be cited or not. It can be incorporated into citation recommendation systems to help automate the citation process by marking se...
Conference Paper
Predicting the performance of a search engine for a given query is a fundamental and challenging task in information retrieval. Accurate performance predictors can be used in various ways, such as triggering an action, choosing the most effective ranking function per query, or selecting the best variant from multiple query formulations. In this pap...
Conference Paper
Axiomatic analysis is a well-defined theoretical framework for analytical evaluation of information retrieval models. The current studies in axiomatic analysis implicitly assume that the constraints (axioms) are independent. In this paper, we revisit this assumption and hypothesize that there might be interdependence relationships between the exist...
Conference Paper
Learning to rank is a key component of modern information retrieval systems. Recently, regression forest models (i.e., random forests, LambdaMART and gradient boosted regression trees) have come to dominate learning to rank systems in practice, as they provide the ability to learn from large scale data while generalizing well to additional test que...