Ingmar Weber

Max-Planck-Institut für Informatik, Saarbrücken, Saarland, Germany

Publications (22) · 0.91 Total impact

  • Article: Output-sensitive autocompletion search
    ABSTRACT: We consider the following autocompletion search scenario: imagine a user of a search engine typing a query; then with every keystroke display those completions of the last query word that would lead to the best hits, and also display the best such hits. The following problem is at the core of this feature: for a fixed document collection, given a set D of documents, and an alphabetical range W of words, compute the set of all word-in-document pairs (w, d) from the collection such that w ∈ W and d ∈ D. We present a new data structure with the help of which such autocompletion queries can be processed, on the average, in time linear in the input plus output size, independent of the size of the underlying document collection. At the same time, our data structure uses no more space than an inverted index. Actual query processing times on a large test collection correlate almost perfectly with our theoretical bound.
    Information Retrieval 07/2008; 11(4):269-286. · 0.91 Impact Factor
  • Article: Effiziente und Proaktive Suche.
    Holger Bast, Ingmar Weber
    KI. 01/2008; 22:58-61.
  • Conference Proceeding: Efficient interactive query expansion with complete search.
    Proceedings of the Sixteenth ACM Conference on Information and Knowledge Management, CIKM 2007, Lisbon, Portugal, November 6-10, 2007; 01/2007
  • Chapter: Sequences Characterizing k-Trees
    ABSTRACT: A non-decreasing sequence of n integers is the degree sequence of a 1-tree (i.e., an ordinary tree) on n vertices if and only if there are at least two 1’s in the sequence, and the sum of the elements is 2(n–1). We generalize this result in the following ways. First, a natural generalization of this statement is a necessary condition for k-trees, and we show that it is not sufficient for any k > 1. Second, we identify non-trivial sufficient conditions for the degree sequences of 2-trees. We also show that these sufficient conditions are almost necessary using bounds on the partition function p(n) and probabilistic methods. Third, we generalize the characterization of degrees of 1-trees in an elegant and counter-intuitive way to yield integer sequences that characterize k-trees, for all k.
    11/2006: pages 216-225;
  • Chapter: Output-Sensitive Autocompletion Search
    ABSTRACT: We consider the following autocompletion search scenario: imagine a user of a search engine typing a query; then with every keystroke display those completions of the last query word that would lead to the best hits, and also display the best such hits. The following problem is at the core of this feature: for a fixed document collection, given a set D of documents, and an alphabetical range W of words, compute the set of all word-in-document pairs (w, d) from the collection such that w ∈ W and d ∈ D. We present a new data structure with the help of which such autocompletion queries can be processed, on the average, in time linear in the input plus output size, independent of the size of the underlying document collection. At the same time, our data structure uses no more space than an inverted index. Actual query processing times on a large test collection correlate almost perfectly with our theoretical bound.
    09/2006: pages 150-162;
  • Conference Proceeding: Type less, find more: fast autocompletion search with a succinct index.
    Holger Bast, Ingmar Weber
    SIGIR 2006: Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Seattle, Washington, USA, August 6-11, 2006; 01/2006
  • Conference Proceeding: Output-Sensitive Autocompletion Search.
    String Processing and Information Retrieval, 13th International Conference, SPIRE 2006, Glasgow, UK, October 11-13, 2006, Proceedings; 01/2006
  • Conference Proceeding: Sequences Characterizing k-Trees
    Computing and Combinatorics, 12th Annual International Conference, COCOON 2006, Taipei, Taiwan, August 15-18, 2006, Proceedings; 01/2006
  • Chapter: Don’t Compare Averages
    Holger Bast, Ingmar Weber
    ABSTRACT: We point out that for two sets of measurements, it can happen that the average of one set is larger than the average of the other set on one scale, but becomes smaller after a non-linear monotone transformation of the individual measurements. We show that the inclusion of error bars is no safeguard against this phenomenon. We give a theorem, however, that limits the amount of “reversal” that can occur; as a by-product we get two non-standard one-sided tail estimates for arbitrary random variables which may be of independent interest. Our findings suggest that in the not infrequent situation where more than one cost measure makes sense, there is no alternative other than to explicitly compare averages for each of them, much unlike what is common practice.
    05/2005: pages 295-304;
  • Chapter: Don't Compare Averages
    Holger Bast, Ingmar Weber
    01/2005: pages 67-76;
  • Conference Proceeding: Don't Compare Averages.
    Holger Bast, Ingmar Weber
    Experimental and Efficient Algorithms, 4th International Workshop, WEA 2005, Santorini Island, Greece, May 10-13, 2005, Proceedings; 01/2005
  • Article: Don't Compare Averages
    4th International Workshop on Efficient and Experimental Algorithms (WEA'05), Springer, 67-76 (2005).
  • Article: Insights from Viewing Ranked Retrieval as Rank Aggregation
    Workshop on Challenges in Web Information Retrieval and Integration (WIRI'05), IEEE, 243-248 (2005).
  • Article: Managing Helpdesk Tasks with CompleteSearch: A Case Study
    Holger Bast, Ingmar Weber
    ABSTRACT: CompleteSearch is a highly interactive search engine, which, instantly after every single keystroke, offers to the user various kinds of feedback, like promising query completions or refinements by category. We combined CompleteSearch with our institute's helpdesk system and carried out a small user study with some of the staff operating the helpdesk. Participants were asked to process ten typical helpdesk requests, alternatingly using CompleteSearch and the off-the-shelf Google Desktop Search. All participants preferred CompleteSearch over Google Desktop, mainly because of its speed, the feeling of being in power, and the enhanced search facilities.
    Gronau, Norbert: 4th Conference on Professional Knowledge Management (WM'07). - Bd. 2, GITO, 101-108 (2007).
  • Article: Efficient interactive query expansion with CompleteSearch
    ABSTRACT: We present an efficient realization of the following interactive search engine feature: as the user is typing the query, words that are related to the last query word and that would lead to good hits are suggested, as well as selected such hits. The realization has three parts: (i) building clusters of related terms, (ii) adding this information as artificial words to the index such that (iii) the described feature reduces to an instance of prefix search and completion. An efficient solution for the latter is provided by the CompleteSearch engine, with which we have integrated the proposed feature. For building the clusters of related terms we propose a variant of latent semantic indexing that, unlike standard approaches, is completely transparent to the user. By experiments on two large test-collections, we demonstrate that the feature is provided at only a slight increase in query processing time and index size.
    Silva, Mário J.; Laender, Alberto A. F.; Baeza-Yates, Ricardo; McGuinness, Deborah L.; Olstad, Bjorn; Olsen, Øystein Haug; Falcão, André O.: CIKM'07 : Proceedings of the 2007 ACM Conference on Information and Knowledge Management, ACM, 857-860 (2007).
  • Article: Type Less, Find More: Fast Autocompletion Search with a Succinct Index
    ABSTRACT: We consider the following full-text search autocompletion feature. Imagine a user of a search engine typing a query. Then with every letter being typed, we would like an instant display of completions of the last query word which would lead to good hits. At the same time, the best hits for any of these completions should be displayed. Known indexing data structures that apply to this problem either incur large processing times for a substantial class of queries, or they use a lot of space. We present a new indexing data structure that uses no more space than a state-of-the-art compressed inverted index, but with 10 times faster query processing times. Even on the large TREC Terabyte collection, which comprises over 25 million documents, we achieve, on a single machine and with the index on disk, average response times of one tenth of a second. We have built a full-fledged, interactive search engine that realizes the proposed autocompletion feature combined with support for proximity search, semi-structured (XML) text, subword and phrase completion, and semantic tags.
    SIGIR 2006: Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, ACM, 364-371 (2006).
  • Article: When You're Lost For Words: Faceted Search With Autocompletion
    SIGIR'06 Workshop on Faceted Search, ACM, 31-35 (2006).
  • Article: Sequences Characterizing k-Trees
    ABSTRACT: A non-decreasing sequence of n integers is the degree sequence of a 1-tree (i.e., an ordinary tree) on n vertices if and only if there are at least two 1’s in the sequence, and the sum of the elements is 2(n–1). We generalize this result in the following ways. First, a natural generalization of this statement is a necessary condition for k-trees, and we show that it is not sufficient for any k > 1. Second, we identify non-trivial sufficient conditions for the degree sequences of 2-trees. We also show that these sufficient conditions are almost necessary using bounds on the partition function p(n) and probabilistic methods. Third, we generalize the characterization of degrees of 1-trees in an elegant and counter-intuitive way to yield integer sequences that characterize k-trees, for all k.
    Computing and Combinatorics, 12th Annual International Conference, COCOON 2006, Springer, 216-225 (2006).
  • Article: The CompleteSearch Engine: Interactive, Efficient, and Towards IR & DB integration
    Holger Bast, Ingmar Weber
    ABSTRACT: We describe CompleteSearch, an interactive search engine that offers the user a variety of complex features, which at first glance have little in common, yet are all provided via one and the same highly optimized core mechanism. This mechanism answers queries for what we call context-sensitive prefix search and completion: given a set of documents and a word range, compute all words from that range which are contained in one of the given documents, as well as those of the given documents which contain a word from the given range. Among the supported features are: (i) automatic query completion, for example, find all completions of the prefix “seman” that occur in the context of the word “ontology”, as well as the best hits for any such completion; (ii) semi-structured (XML) retrieval, for example, find all email messages with “dbworld” in the subject line; (iii) semantic search, for example, find all politicians which had a private audience with the pope; (iv) DB-style joins and grouping, for example, find the most prolific authors with at least one paper in both “SIGMOD” and “SIGIR”; and (v) arbitrary combinations of these. The prefix search and completion mechanism of CompleteSearch is realized via a novel kind of index data structure, which enables subsecond query processing times for collections up to a terabyte of data, on a single PC. We report on a number of lessons learned in the process of building the system and on our experience with a number of publicly used deployments.
    CIDR 2007 : 3rd Biennial Conference on Innovative Data Systems Research, University of Wisconsin / Computer Science Department, 88-95 (2007).
  • Article: Output-Sensitive Autocompletion Search
    ABSTRACT: We consider the following autocompletion search scenario: imagine a user of a search engine typing a query; then with every keystroke display those completions of the last query word that would lead to the best hits, and also display the best such hits. The following problem is at the core of this feature: for a fixed document collection, given a set D of documents, and an alphabetical range W of words, compute the set of all word-in-document pairs (w, d) from the collection such that w ∈ W and d ∈ D. We present a new data structure with the help of which such autocompletion queries can be processed, on the average, in time linear in the input plus output size, independent of the size of the underlying document collection. At the same time, our data structure uses no more space than an inverted index. Actual query processing times on a large test collection correlate almost perfectly with our theoretical bound.
    String Processing and Information Retrieval : 13th International Conference, SPIRE 2006, Springer, 150-162 (2006).
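
Illustration of the autocompletion query stated in the abstracts above (given a document set D and a word range W, report all pairs (w, d) with w ∈ W and d ∈ D). The following is a minimal Python sketch of a naive baseline over a plain inverted index. It is not the data structure proposed in the listed papers; the function names, the toy documents, and the encoding of a prefix as the range ["sem", "sem\uffff"] are assumptions made only for this example.

```python
from bisect import bisect_left, bisect_right


def build_inverted_index(docs):
    """Map each word to the sorted list of ids of the documents containing it."""
    index = {}
    for doc_id, text in docs.items():
        for word in set(text.lower().split()):
            index.setdefault(word, []).append(doc_id)
    for postings in index.values():
        postings.sort()
    return index


def autocompletion_query(index, vocabulary, doc_set, word_lo, word_hi):
    """Return all (w, d) with word_lo <= w <= word_hi and d in doc_set.

    Naive baseline: every posting of every word in the range is scanned,
    so the work grows with the collection, not only with the output.
    """
    start = bisect_left(vocabulary, word_lo)
    end = bisect_right(vocabulary, word_hi)
    pairs = []
    for word in vocabulary[start:end]:
        for doc_id in index[word]:
            if doc_id in doc_set:
                pairs.append((word, doc_id))
    return pairs


# Hypothetical toy collection, for illustration only.
docs = {
    1: "semantic search with ontology",
    2: "seminar on ontology matching",
    3: "semantic web and databases",
}
index = build_inverted_index(docs)
vocabulary = sorted(index)

# The prefix "sem" expressed as an alphabetical word range (an assumption).
result = autocompletion_query(index, vocabulary, {1, 2}, "sem", "sem\uffff")
print(result)  # [('semantic', 1), ('seminar', 2)]
```

The baseline touches the posting of document 3 for "semantic" even though that document is not in D. Avoiding this kind of wasted work is what the abstracts mean by processing time "linear in the input plus output size, independent of the size of the underlying document collection".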