Suggesting Topic-Based Query Terms as You Type.
ABSTRACT: Query term suggestion, which interactively expands queries, is an indispensable technique for helping users formulate high-quality queries and has attracted much attention in the web search community. Existing methods usually suggest terms based on statistics from documents, query logs, and external dictionaries, but they neglect topic information, which is crucial because it helps retrieve topically relevant documents. To improve the user experience, we propose a novel term suggestion method: as the user types a query letter by letter, we instantly suggest terms that are topically coherent with the query and can retrieve relevant documents. To suggest highly relevant terms effectively, we propose a generative model that incorporates the topical coherence of terms. The model learns topics from the underlying documents using Latent Dirichlet Allocation (LDA). To achieve instant query suggestion, we use a trie structure to index and access terms, and we devise an efficient top-k algorithm to suggest terms as users type. Experimental results show that our approach not only improves the effectiveness of term suggestion but also achieves better efficiency and scalability.
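The abstract names two ingredients, a trie for prefix lookup and a top-k ranking by topical coherence, without giving details. The following is a minimal sketch of how they might fit together, not the authors' algorithm. It rests on several assumptions: the per-term topic distributions in `TOPICS` are hand-written toy values rather than LDA output, coherence is approximated by a dot product of topic distributions, and the names `TermTrie`, `coherence`, and `suggest` are hypothetical. It also enumerates every completion of the prefix before ranking, so it does not reproduce the efficient top-k algorithm the paper describes.

```python
import heapq
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TrieNode:
    children: dict = field(default_factory=dict)
    term: Optional[str] = None  # set when a complete term ends at this node


class TermTrie:
    """Plain character trie used to find all terms sharing a typed prefix."""

    def __init__(self):
        self.root = TrieNode()

    def insert(self, term):
        node = self.root
        for ch in term:
            node = node.children.setdefault(ch, TrieNode())
        node.term = term

    def complete(self, prefix):
        """Yield every indexed term that starts with `prefix`."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return  # no indexed term has this prefix
        stack = [node]
        while stack:
            node = stack.pop()
            if node.term is not None:
                yield node.term
            stack.extend(node.children.values())


def coherence(term_topics, context_topics):
    """Stand-in coherence score: dot product of two topic distributions."""
    return sum(p * q for p, q in zip(term_topics, context_topics))


def suggest(trie, topics, context_terms, prefix, k=3):
    """Return the k completions of `prefix` most coherent with `context_terms`."""
    n_topics = len(next(iter(topics.values())))
    if context_terms:
        # Average the topic distributions of the terms already typed.
        context = [
            sum(topics[t][i] for t in context_terms) / len(context_terms)
            for i in range(n_topics)
        ]
    else:
        context = [1.0 / n_topics] * n_topics  # no context: uniform topics
    candidates = (t for t in trie.complete(prefix) if t not in context_terms)
    return heapq.nlargest(k, candidates, key=lambda t: coherence(topics[t], context))


# Toy data (assumed): topic 0 ~ programming, topic 1 ~ travel.
TOPICS = {
    "java": [0.9, 0.1],
    "javascript": [0.95, 0.05],
    "jar": [0.7, 0.3],
    "jakarta": [0.2, 0.8],
    "jamaica": [0.05, 0.95],
    "compiler": [0.9, 0.1],
    "island": [0.1, 0.9],
}

trie = TermTrie()
for term in TOPICS:
    trie.insert(term)

print(suggest(trie, TOPICS, ["compiler"], "ja", k=2))  # ['javascript', 'java']
print(suggest(trie, TOPICS, ["island"], "ja", k=2))    # ['jamaica', 'jakarta']
```

The same prefix `ja` yields different suggestions depending on the topic of the terms already in the query, which is the behavior the abstract motivates. A production version would presumably avoid scoring every completion, for example by storing score bounds in trie nodes and pruning subtrees.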
- Available from: Giuseppe Vizzari
ABSTRACT: Autocompletion systems support users in formulating queries in different settings, from development environments to the web. In this paper we describe Composite Match Autocompletion (COMMA), a lightweight approach to introducing semantics into an autocompletion matching algorithm for semi-structured data. The approach is formally described, then applied and evaluated with specific reference to the e-commerce context. The semantic extension to the matching algorithm exploits available information about product categories and distinguishing features of products to improve the handling of exploratory queries. COMMA seamlessly supports both targeted/precise queries and exploratory/vague ones, combining different filtering and scoring techniques. The algorithm is evaluated with respect to both effectiveness and efficiency in a real-world scenario: the achieved improvement is significant and is not associated with an appreciable increase in computational cost. Web Intelligence and Agent Systems 01/2014; 12(1):35-49.
- ABSTRACT: As an important operation for finding existing relevant patents and validating a new patent application, patent search has attracted considerable attention recently. However, many users have limited knowledge about the underlying patents, and they have to use a try-and-see approach, repeatedly issuing different queries and checking the answers, which is a very tedious process. To address this problem, in this paper we propose a new user-friendly patent search paradigm that helps users find relevant patents more easily and improves the search experience. We propose three effective techniques, error correction, topic-based query suggestion, and query expansion, to improve the usability of patent search. We also study how to efficiently find relevant answers from a large collection of patents. We first divide the patents into small partitions based on their topics and classes. Then, given a query, we find the highly relevant partitions and answer the query in each of them. Finally, we combine the answers from each partition and generate the top-k answers to the patent-search query. IEEE Transactions on Knowledge and Data Engineering 06/2013; 25(6):1439-1443. DOI:10.1109/TKDE.2012.63 · 1.82 Impact Factor
- ABSTRACT: Term suggestion recommends query terms to a user based on the user's initial query. Suggesting adequate terms is a challenging issue. Most existing commercial search engines suggest search terms based on the frequency of previously used terms that match the leading letters the user types. In this article, we present a novel mechanism for constructing semantic term-relation graphs to suggest relevant search terms at the semantic level. We build term-relation graphs based on multipartite networks of existing social media, especially Wikipedia. The multipartite linkage networks of contributor-term, term-category, and term-term are extracted from Wikipedia to eventually form term-relation graphs. To fuse these multipartite linkage networks, we propose incorporating the contributor-category networks to model the expertise of the contributors. In our experiments, this step clearly improved the accuracy of the inferred relatedness in the term-semantic graphs. Experiments on keyword-expanded search based on 200 TREC-5 ad-hoc topics showed a clear advantage of our algorithms over existing approaches. ACM Transactions on Intelligent Systems and Technology 12/2013; 5(1). DOI:10.1145/2542182.2542201 · 9.39 Impact Factor