Conference Paper

Suggesting Topic-Based Query Terms as You Type.

DOI: 10.1109/APWeb.2010.13
Conference: Advances in Web Technologies and Applications, Proceedings of the 12th Asia-Pacific Web Conference, APWeb 2010, Busan, Korea, 6-8 April 2010
Source: DBLP

ABSTRACT Query term suggestion, which interactively expands queries, is an indispensable technique for helping users formulate high-quality queries and has attracted much attention in the web search community. Existing methods usually suggest terms based on document statistics, query logs, and external dictionaries, but they neglect topic information, which is crucial because it helps retrieve topically relevant documents. To better serve users, we propose a novel term suggestion method: as the user types a query letter by letter, we instantly suggest terms that are topically coherent with the query and could retrieve relevant documents. To suggest highly relevant terms effectively, we propose a generative model that incorporates the topical coherence of terms. The model learns topics from the underlying documents using Latent Dirichlet Allocation (LDA). To achieve instant query suggestion, we use a trie structure to index and access terms, and we devise an efficient top-k algorithm to suggest terms as users type. Experimental results show that our approach not only improves the effectiveness of term suggestion but also achieves better efficiency and scalability.
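
The trie-based lookup mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's top-k algorithm, which the abstract does not specify: it simply walks to the node for the typed prefix, scans every completion below it, and keeps the k best. The names `TrieNode`, `TermTrie`, `insert`, and `suggest` are hypothetical, and the per-term `score` is an assumed stand-in for whatever topical-coherence score a trained model would supply.

```python
import heapq
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TrieNode:
    """One trie node; `term` is set only where a complete term ends."""
    children: Dict[str, "TrieNode"] = field(default_factory=dict)
    term: Optional[str] = None
    score: float = 0.0  # assumed precomputed relevance score (hypothetical)


class TermTrie:
    """Hypothetical index mapping prefixes to scored terms."""

    def __init__(self) -> None:
        self.root = TrieNode()

    def insert(self, term: str, score: float) -> None:
        # Walk or create one node per character, then mark the end node.
        node = self.root
        for ch in term:
            node = node.children.setdefault(ch, TrieNode())
        node.term = term
        node.score = score

    def suggest(self, prefix: str, k: int) -> List[str]:
        # Descend to the node that represents the typed prefix.
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []  # no indexed term starts with this prefix
        # Exhaustively collect every completed term below that node.
        found = []
        stack = [node]
        while stack:
            cur = stack.pop()
            if cur.term is not None:
                found.append((cur.score, cur.term))
            stack.extend(cur.children.values())
        # Keep the k highest-scoring terms, best first.
        return [term for _, term in heapq.nlargest(k, found)]


trie = TermTrie()
for term, score in [("latent", 0.9), ("latency", 0.7), ("lattice", 0.4),
                    ("lda", 0.8), ("trie", 0.5)]:
    trie.insert(term, score)

print(trie.suggest("lat", 2))  # the two best-scoring terms starting with "lat"
```

Scanning the whole subtree costs time proportional to the number of terms under the prefix, which is largest for one-letter prefixes. A common refinement, again an assumption rather than something stated in the abstract, is to cache the best score or the top-k list at each node so that unpromising branches can be skipped.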

  • ABSTRACT: Keyword-based Web search is a widely used approach for locating information on the Web. However, Web users often have difficulty organizing and formulating appropriate input queries because they lack sufficient domain knowledge, which greatly affects search performance. An effective way to meet a search engine user's information needs is to suggest Web queries that are topically related to the initial query. Accurately computing query-to-query similarity scores is key to improving the quality of these suggestions. Because queries are short, traditional pseudo-relevance or implicit-relevance based approaches expand the expression of the queries for the similarity computation. They explicitly use a search engine as a complementary source and directly extract additional features (such as terms or URLs) from the top-listed or clicked search results. In this paper, we propose a novel approach that uses hidden topics as an expandable feature. It has two steps. In the offline model-learning step, a hidden topic model is trained, and for each candidate query, its posterior distribution over the hidden topic space is determined and used to re-express the query in place of its lexical expression. In the online query suggestion step, after inferring the topic distribution of an input query in the same way, we calculate the similarity between the candidate queries and the input query in terms of their topic distributions and produce a suggestion list of candidate queries based on the similarity scores. Our experimental results on two real data sets show that hidden topic based suggestion is much more efficient than the traditional term or URL based approach and is effective in finding topically related queries for suggestion.
    World Wide Web 05/2013; 16(3). · 1.20 Impact Factor
  • ABSTRACT: Autocompletion systems support users in formulating queries in different computer systems, from development environments to the web. In this paper we describe Composite Match Autocompletion (COMMA), a lightweight approach to introducing semantics into an autocompletion matching algorithm for semi-structured data. The approach is formally described, then applied and evaluated with specific reference to the e-commerce context. The semantic extension to the matching algorithm exploits available information about product categories and distinguishing features of products to improve the handling of exploratory queries. COMMA seamlessly handles both targeted/precise queries and exploratory/vague ones, combining different filtering and scoring techniques. The algorithm is evaluated for both effectiveness and efficiency in a real-world scenario: the improvement achieved is significant and does not come with an appreciable increase in computational cost.
    Web Intelligence and Intelligent Agent Technology (WI-IAT), 2012 IEEE/WIC/ACM International Conferences on; 01/2012
  • ABSTRACT: Term suggestion recommends query terms to a user based on his or her initial query. Suggesting adequate terms is challenging. Most existing commercial search engines suggest search terms based on the frequency of previously used terms that match the leading letters the user types. In this article, we present a novel mechanism for constructing semantic term-relation graphs to suggest relevant search terms at the semantic level. We build term-relation graphs from multipartite networks in existing social media, in particular Wikipedia. The multipartite linkage networks of contributor-term, term-category, and term-term are extracted from Wikipedia to form the term-relation graphs. To fuse these multipartite linkage networks, we propose incorporating contributor-category networks to model the expertise of the contributors. In our experiments, this step clearly improved the accuracy of the inferred relatedness in the term-relation graphs. Experiments on keyword-expanded search based on 200 TREC-5 ad-hoc topics showed a clear advantage of our algorithms over existing approaches.
    ACM Transactions on Intelligent Systems and Technology (TIST). 12/2013; 5(1).