The retrieval of sentences that are relevant to a given information need is a challenging passage retrieval task. In this
context, the well-known vocabulary mismatch problem is especially severe because of the fine granularity of the task. Short queries,
which are usually the rule rather than the exception, aggravate the problem. Consequently, effective sentence retrieval methods
tend to apply some form of query expansion, usually based on pseudo-relevance feedback. Nevertheless, there are no extensive
studies comparing different statistical expansion strategies for sentence retrieval. In this work we thoroughly study the
effect of different statistical expansion methods on sentence retrieval. We start from a set of retrieved documents in which
relevant sentences have to be found. We evaluate different term selection strategies and provide empirical evidence
that expansion before sentence retrieval yields competitive performance. This is novel because expansion for sentence
retrieval is often done after sentence retrieval (i.e., expansion terms are mined from a ranked set of sentences) and no
comparative results are available for the two types of expansion. Furthermore, this comparison is particularly valuable
because it has important implications for time
efficiency. We also carefully analyze expansion on weak and strong queries and demonstrate clearly that expanding queries
before sentence retrieval is not only more convenient for efficiency purposes, but also more effective when handling poor
queries.
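
The difference between the two expansion orders can be illustrated with a minimal sketch. It is not the method evaluated in this work: the overlap-based scorer, the frequency-based term selection, the stopword list, the function names (`expand_before`, `expand_after`, `select_terms`) and the toy data are all assumptions made for illustration only.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {"the", "a", "an", "of", "in", "was", "near", "and", "to"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def rank_sentences(query_terms, sentences):
    """Toy scorer: rank sentences by the number of tokens that are query terms."""
    def score(sentence):
        return sum(1 for t in tokenize(sentence) if t in query_terms)
    # sorted() is stable, so tied sentences keep their original order.
    return sorted(sentences, key=score, reverse=True)


def select_terms(texts, query_terms, k):
    """Pick the k most frequent non-query terms (ties broken alphabetically)."""
    counts = Counter(
        t for text in texts for t in tokenize(text) if t not in query_terms
    )
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


def expand_before(query, documents, k=1):
    """Mine expansion terms from the retrieved documents, then rank sentences once."""
    query_terms = set(tokenize(query))
    sentences = [s for doc in documents for s in doc]
    doc_texts = [" ".join(doc) for doc in documents]
    expanded = query_terms | set(select_terms(doc_texts, query_terms, k))
    return expanded, rank_sentences(expanded, sentences)


def expand_after(query, documents, k=1, n_feedback=2):
    """Rank sentences first, mine terms from the top sentences, then rank again."""
    query_terms = set(tokenize(query))
    sentences = [s for doc in documents for s in doc]
    initial = rank_sentences(query_terms, sentences)  # extra retrieval pass
    expanded = query_terms | set(select_terms(initial[:n_feedback], query_terms, k))
    return expanded, rank_sentences(expanded, sentences)


# Hypothetical retrieved documents, each given as a list of sentences.
documents = [
    [
        "The reactor leak released radiation.",
        "Radiation exposure causes illness.",
        "Officials evacuated the town.",
    ],
    [
        "Radiation levels remained high near the reactor.",
        "The weather was sunny.",
    ],
]
query = "reactor leak"

before_terms, before_ranking = expand_before(query, documents)
after_terms, after_ranking = expand_after(query, documents)
```

In this sketch, expansion after sentence retrieval needs two sentence-ranking passes, whereas expansion before sentence retrieval needs only one, which is where the efficiency difference comes from.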