Eugene Agichtein

Emory University | EU · Department of Mathematics and Computer Science

About

170
Publications
35,299
Reads
9,657
Citations
Additional affiliations
September 2006 - present
Emory University
Position
  • Professor (Associate)


Publications (170)
Preprint
Full-text available
Since its inception in 2016, the Alexa Prize program has enabled hundreds of university students to explore and compete to develop conversational agents through the SocialBot Grand Challenge. The goal of the challenge is to build agents capable of conversing coherently and engagingly with humans on popular topics for 20 minutes, while achieving an...
Article
Full-text available
A key application of conversational search is refining a user’s search intent by asking a series of clarification questions, aiming to improve the relevance of search results. Training and evaluating such conversational systems currently requires human participation, making it infeasible to examine a wide range of user behaviors. To support robust...
Preprint
Current interactive systems with natural language interface lack an ability to understand a complex information-seeking request which expresses several implicit constraints at once, and there is no prior information about user preferences, e.g., "find hiking trails around San Francisco which are accessible with toddlers and have beautiful scenery i...
Preprint
Full-text available
To support complex search tasks, where the initial information requirements are complex or may change during the search, a search engine must adapt the information delivery as the user's information requirements evolve. To support this dynamic ranking paradigm effectively, search result ranking must incorporate both the user feedback received, and...
Preprint
Full-text available
In search and recommendation, diversifying the multi-aspect search results could help with reducing redundancy, and promoting results that might not be shown otherwise. Many previous methods have been proposed for this task. However, previous methods do not explicitly consider the uniformity of the number of the items' classes, or evenness, which c...
Preprint
Full-text available
Users' clicks on Web search results are one of the key signals for evaluating and improving web search quality and have been widely used as part of current state-of-the-art Learning-To-Rank (LTR) models. With a large volume of search logs available for major search engines, effective models of searcher click behavior have emerged to evaluate and tra...
Preprint
Mapping a search query to a set of relevant categories in the product taxonomy is a significant challenge in e-commerce search for two reasons: 1) Training data exhibits severe class imbalance problem due to biased click behavior, and 2) queries with little customer feedback (e.g., tail queries) are not well-represented in the training set...
Preprint
Query categorization is an essential part of query intent understanding in e-commerce search. A common query categorization task is to select the relevant fine-grained product categories in a product taxonomy. For frequent queries, rich customer behavior (e.g., click-through data) can be used to infer the relevant product categories. However, for m...
Preprint
Retrieving all semantically relevant products from the product catalog is an important problem in E-commerce. Compared to web documents, product catalogs are more structured and sparse due to multi-instance fields that encode heterogeneous aspects of products (e.g. brand name and product dimensions). In this paper, we propose a new semantic product...
Preprint
Predicting user satisfaction in conversational systems has become critical, as spoken conversational assistants operate in increasingly complex domains. Online satisfaction prediction (i.e., predicting satisfaction of the user with the system after each turn) could be used as a new proxy for implicit user feedback, and offers promising opportunitie...
Preprint
As voice-based assistants such as Alexa, Siri, and Google Assistant become ubiquitous, users increasingly expect to maintain natural and informative conversations with such systems. However, for an open-domain conversational system to be coherent and engaging, it must be able to maintain the user's interest for extended periods, without sounding bo...
Preprint
One of the key benefits of voice-based personal assistants is the potential to proactively recommend relevant and interesting information. One of the most valuable sources of such information is the News. However, in order for the user to hear the news that is useful and relevant to them, it must be recommended in an interesting and informative way...
Preprint
Full-text available
Classifying the general intent of the user utterance in a conversation, also known as Dialogue Act (DA), e.g., open-ended question, statement of opinion, or request for an opinion, is a key step in Natural Language Understanding (NLU) for conversational agents. While DA classification has been extensively studied in human-human conversations, it ha...
Preprint
Full-text available
An accurate understanding of a user's query intent can help improve the performance of downstream tasks such as query scoping and ranking. In the e-commerce domain, recent work in query understanding focuses on the query to product-category mapping. But, a small yet significant percentage of queries (in our website 1.5% or 33M queries in 2019) have...
Preprint
Full-text available
To hold a true conversation, an intelligent agent should be able to occasionally take initiative and recommend the next natural conversation topic. This is a challenging task. A topic suggested by the agent should be relevant to the person, appropriate for the conversation context, and the agent should have something interesting to say about it. Th...
Preprint
Full-text available
Identifying the topic (domain) of each user's utterance in open-domain conversational systems is a crucial step for all subsequent language understanding and response tasks. In particular, for complex domains, an utterance is often routed to a single component responsible for that domain. Thus, correctly mapping a user utterance to the right domain...
Conference Paper
Full-text available
To hold a true conversation, an intelligent agent should be able to occasionally take initiative and recommend the next natural conversation topic. This is a challenging task. A topic suggested by the agent should be relevant to the person, appropriate for the conversation context, and the agent should have something interesting to say about it. Th...
Conference Paper
Predicting user satisfaction in conversational systems has become critical, as spoken conversational assistants operate in increasingly complex domains. Online satisfaction prediction (i.e., predicting satisfaction of the user with the system after each turn) could be used as a new proxy for implicit user feedback, and offers promising opportunitie...
Conference Paper
Identifying the topic (domain) of each user's utterance in open-domain conversational systems is a crucial step for all subsequent language understanding and response tasks. In particular, for complex domains, an utterance is often routed to a single component responsible for that domain. Thus, correctly mapping a user utterance to the right domain...
Conference Paper
Result ranking diversification has become an important issue for web search, summarization, and question answering. For more complex questions with multiple aspects, such as those in community-based question answering (CQA) sites, a retrieval system should provide a diversified set of relevant results, addressing the different aspects of the query,...
Article
We show that incorporating user behavior data can significantly improve ordering of top results in a real web search setting. We examine alternatives for incorporating feedback into the ranking process and explore the contributions of user feedback compared to other common web search features. We report results of a large scale evaluation over 3,000...
Conference Paper
Full-text available
While activity-aware music recommendation has been shown to improve the listener experience, we posit that modeling the listening intent can further improve recommendation quality. In this paper, we perform initial exploration of the dominant music listening intents associated with common activities, using music retrieved from popular online mu...
Conference Paper
Full-text available
Why do we listen to music? This question has as many answers as there are people, which may vary by time of day, and the activity of the listener. We envision a contextual music search and recommendation system, which could suggest appropriate music to the user in the current context. As an important step in this direction, we set out to understand...
Conference Paper
Full-text available
Chatbots and conversational assistants are becoming increasingly popular. However, for information seeking scenarios, these systems still have very limited conversational abilities, and primarily serve as proxies to existing web search engines. In this work, we ask: what would conversational search look like with a truly intelligent assistant? To b...
Conference Paper
Search as a dialogue is an emerging paradigm that is fueled by the proliferation of mobile devices and technological advances, e.g. in speech recognition and natural language processing. Such an interface allows search systems to engage in a dialogue with users aimed at fulfilling their information needs. One key capability required to make such se...
Conference Paper
One of the major challenges for automated question answering over Knowledge Bases (KBQA) is translating a natural language question to the Knowledge Base (KB) entities and predicates. Previous systems have used a limited amount of training data to learn a lexicon that is later used for question answering. This approach does not make use of other po...
Conference Paper
Full-text available
Web search engines have made great progress at answering factoid queries. However, they are not well-tailored for managing more complex questions, especially when they require explanation and/or description. The WebQA workshop series aims at exploring diverse approaches to answering questions on the Web. This year, particular emphasis will be given...
Conference Paper
Social media gives voice to the people, but also opens the door to low-quality contributions, which degrade the experience for the majority of users. To address the latter issue, the prevailing solution is to rely on the “wisdom of the crowds” to promote good content (e.g., via votes or “like” buttons), or to downgrade bad content. Unfortun...
Conference Paper
Full-text available
Modern search engines have made dramatic progress in the answering of many users’ questions about facts, such as those that might be retrieved or directly inferred from a knowledge base. However, many other questions that real users ask are more complex, such as asking for opinions or advice for a particular situation, and are still largely beyond...
Conference Paper
Full-text available
With mobile devices, web search is no longer limited to specific locations. People conduct search from practically anywhere, including at home, at work, when traveling and when on vacation. How should this influence search tools and web services? In this paper, we argue that information needs are affected by the familiarity of the environment. To f...
Conference Paper
Modeling and predicting user attention is crucial for interpreting search behavior. The numerous applications include quantifying web search satisfaction, estimating search quality, and measuring and predicting online user engagement. While prior research has demonstrated the value of mouse cursor data and other interactions as a rough proxy of use...
Conference Paper
Full-text available
Although topic models designed for textual collections annotated with geographical meta-data have been previously shown to be effective at capturing vocabulary preferences of people living in different geographical regions, little is known about their utility for information retrieval in general or microblog retrieval in particular. In this work, w...
Article
Previous studies of online user attention during information seeking tasks have mainly focused on analyzing searcher behavior in the web search settings. While these studies enabled better understanding of search result examination, their findings might not generalize for the tasks and search interfaces in other domains such as Shopping or Social M...
Article
Extensive previous research has shown that searchers often require assistance with query formulation and refinement. Yet, it is not clear what kind of assistance is most useful, and how effective it is both objectively (e.g., in terms of task success) and subjectively (e.g., in terms of searcher perception of the search difficulty). This work des...
Conference Paper
Extensive previous research has shown that searchers often require assistance with query formulation and refinement. Yet, it is not clear what kind of assistance is most useful, and how effective it is both objectively (e.g., in terms of task success) and subjectively (e.g., in terms of searcher perception of the search difficulty). This work des...
Article
Full-text available
The largest publicly available knowledge repositories, such as Wikipedia and Freebase, owe their existence and growth to volunteer contributors around the globe. While the majority of contributions are correct, errors can still creep in, due to editors' carelessness, misunderstanding of the schema, malice, or even lack of accepted ground truth. If...
Conference Paper
Web search behavior and interaction data, such as mouse cursor movements, can provide valuable information on how searchers examine and engage with the web search results. This interaction data is far richer than traditional search click data, and can be used to improve search ranking, evaluation, and presentation. Unfortunately, the diversity and...
Conference Paper
Full-text available
Entity ranking has become increasingly important, both for retrieving structured entities and for use in general web search applications. The most common format for linked data, RDF graphs, provide extensive semantic structure via predicate links. While the semantic information is potentially valuable for effective search, the resulting adjacency m...
Conference Paper
Full-text available
Social media users create virtual connections for various reasons: personal and professional. While significant research efforts have been spent on exploring the dynamics of creation of social network connections, little is known about how those connections influence the content generated by social media users. In this work, we quantitatively evalu...
Conference Paper
Query-biased search result summaries, or "snippets", help users decide whether a result is relevant for their information need, and have become increasingly important for helping searchers with difficult or ambiguous search tasks. Previously published snippet generation algorithms have been primarily based on selecting document fragments most simil...
Conference Paper
This workshop brings together researchers and practitioners from industry and academia to discuss search and discovery in the medical domain. The event focuses on ways to make medical and health information more accessible to laypeople (including enhancements to ranking algorithms and search interfaces), and how we can discover new medical facts...
Article
Fine-grained search interactions in the desktop setting, such as mouse cursor movements and scrolling, have been shown valuable for understanding user intent, attention, and their preferences for Web search results. As web search on smart phones and tablets becomes increasingly popular, previously validated desktop interaction models have to be ada...
Conference Paper
Full-text available
We propose the methods for document, query and relevance model expansion that leverage geographical metadata provided by social media. In particular, we propose a geographically-aware extension of the LDA topic model and utilize the resulting topics and language models in our expansion methods. The proposed approach has been experimentally evaluate...
Conference Paper
Fine-grained search interactions such as mouse cursor movements and scrolling have been shown to be valuable for modeling user attention and preferences of Web search results, in the desktop setting. However, users increasingly search the Web on touch-enabled devices such as smart phones and tablets, where they zoom and swipe instead of mousing and...
Article
The workshop brought together 40 researchers and practitioners from academia and industry to discuss search and discovery in the medical domain. Presentations and discussions spanned several challenging and important topics, including directions improving the accessibility of medical and health information for lay people (with associated enhancemen...
Conference Paper
Smartphones, and other similar devices, are ideal for private activities: They are carried on the body and can be physically secured from prying eyes. They work anywhere. And they are not typically shared among family members or classmates, making them perfect for searching and exploring sensitive information — both via web search or through social...
Article
Passage retrieval is a crucial first step of automatic Question Answering (QA). While existing passage retrieval algorithms are effective at selecting document passages most similar to the question, or those that contain the expected answer types, they do not take into account which parts of the document the searchers actually found useful. We prop...
Conference Paper
Detecting and predicting searcher success is essential for automatically evaluating and improving Web search engine performance. In the past, Web searcher behavior data, such as result clickthrough, dwell time, and query reformulation sequences, have been successfully used for a variety of tasks, including prediction of success in a search session....
Article
While Web search has become increasingly effective over the last decade, for many users' needs the required answers may be spread across many documents, or may not exist on the Web at all. Yet, many of these needs could be addressed by asking people via popular Community Question Answering (CQA) services, such as Baidu Knows, Quora, or Yahoo! Answe...
Article
Previous studies of web search result examination have provided valuable insights in understanding and modelling searcher behavior. Yet, recent work (e.g., [3]) has been developed based on the assumption that the time a searcher spends examining a particular result abstract or snippet, correlates with result relevance. While this idea is intuitivel...
Article
Full-text available
Many important search tasks require multiple search sessions to complete. Tasks such as travel planning, large purchases, or job searches can span hours, days, or even weeks. Inevitably, life interferes, requiring the searcher either to recover the "state" of the search manually (most common), or plan for interruption in advance (unlikely). The goa...
Article
Many questions submitted to Collaborative Question Answering (CQA) sites have similar questions answered before. We propose a precise approach of automatically finding an answer to such questions by automatically identifying “equivalent” questions submitted and answered in the past. Our method is based on automatically generating equivalent questi...
Article
Full-text available
Latent topic analysis has emerged as one of the most effective methods for classifying, clustering and retrieving textual data. However, existing models such as Latent Dirichlet Allocation (LDA) were developed for static corpora of relatively large documents. In contrast, much of the textual content on the web, and especially social media, is tempo...
Article
Full-text available
Result clickthrough statistics and dwell time on clicked results have been shown valuable for inferring search result relevance, but the interpretation of these signals can vary substantially for different tasks and users. This paper shows that post-click searcher behavior, such as cursor movement and scrolling, provides additional clues for b...
Article
Full-text available
Clickthroughs on search results have been successfully used to infer user interest and preferences, but are often noisy and potentially ambiguous. The reason mainly lies in that the clickthrough features are inherently a representation of the majority of user intents, rather than the information needs of the individual users for a given query insta...
Conference Paper
Semantically similar questions are submitted to collaborative question answering systems repeatedly even though these questions have already received best answers. To solve the problem, we propose a precise approach of automatically finding an answer to such questions by identifying "equivalent" questions submitted and answered. Our method is base...
Conference Paper
Proliferation of ubiquitous access to the Internet enables millions of Web users to collaborate online on a variety of activities. Many of these activities result in the construction of large repositories of knowledge, either as their primary aim (e.g., Wikipedia) or as a by-product (e.g., Yahoo! Answers). In this tutorial, we will discuss organizi...
Article
In this paper, we highlight the potential of n-grams as a vehicle to explore the ‘evolution’ of the law and legal language. Using the full text corpus of decisions of the United States Supreme Court (1791-2005), we explore the n-gram space, offer some initial results based upon our calculations and highlight the beta version of our n-gram search in...
Conference Paper
Full-text available
Most of the information on the Web is inherently structured, product pages of large online shopping sites such as Amazon.com being a typical example. Yet, unstructured keyword queries are still the most common way to search for such structured information, producing ambiguities and poor ranking, and thereby degrading user experience. This proble...
Article
The Visual Paired Comparison (VPC) task is a recognition memory test that has shown promise for the detection of memory impairments associated with mild cognitive impairment (MCI). Because patients with MCI often progress to Alzheimer's Disease (AD), the VPC may be useful in predicting the onset of AD. VPC uses noninvasive eye tracking to identify...
Conference Paper
Full-text available
A key functionality in Collaborative Question Answering (CQA) systems is the assignment of the questions from information seekers to the potential answerers. An attractive solution is to automatically recommend the questions to the potential answerers with expertise or interest in the question topic. However, previous work has largely ignored a key...