Query Expansion using Associated Queries
Bodo Billerbeck Falk Scholer Hugh E. Williams Justin Zobel
School of Computer Science and Information Technology
RMIT University, GPO Box 2476V
Melbourne, Australia, 3001.
{bodob, fscholer, hugh, jz}@cs.rmit.edu.au
ABSTRACT
Hundreds of millions of users each day use web search en-
gines to meet their information needs. Advances in web
search effectiveness are therefore perhaps the most signifi-
cant public outcomes of IR research. Query expansion is
one such method for improving the effectiveness of ranked
retrieval by adding additional terms to a query. In pre-
vious approaches to query expansion, the additional terms
are selected from highly ranked documents returned from
an initial retrieval run. We propose a new method of ob-
taining expansion terms, based on selecting terms from past
user queries that are associated with documents in the col-
lection. Our scheme is effective for query expansion for web
retrieval: our results show relative improvements over unex-
panded full text retrieval of 26%–29%, and 18%–20% over
an optimised, conventional expansion approach.
Categories and Subject Descriptors
H.3.3 [Information Storage and Retrieval]: Information
Search and Retrieval—Query Formulation, Relevance Feed-
back, Search Process; H.3.4 [Information Storage and
Retrieval]: Systems and Software—Performance Evaluation
(Efficiency and Effectiveness)
General Terms
Experimentation, Performance
Keywords
Query Expansion, Query Association, Web Search
1. INTRODUCTION
Information retrieval aims to find documents that are rele-
vant to a user’s information need. In web retrieval, the need
is typically expressed as a query consisting of a small num-
ber of words (Spink, Wolfram, Jansen & Saracevic 2001),
and answer documents are chosen based on the statistical
similarity of the query to the individual documents in the
collection. Much research over several decades has led to the
development of statistical similarity measures that are rea-
sonably effective at finding answers for even the shortest
queries (Witten, Moffat & Bell 1999). However, the addition
of well-chosen extra query terms can lead to significant im-
provements in effectiveness.
A wide range of methods for adding query terms has
been proposed, from manual techniques such as iter-
ative query development and relevance feedback to auto-
matic techniques such as thesaural expansion, anchor-text
ranking, and query expansion. In query expansion—which
is the focus of this paper—the additional query terms are
extracted from highly ranked documents, on the assumption
that these are likely to be relevant. It has been shown to
be effective on some collections, but results on large collec-
tions of web data have been mixed. For example, in our
work (Billerbeck & Zobel 2003) using the Okapi approach
to ranking, we have found not only that the standard pa-
rameters are inappropriate for the web data, but that, even
with the best parameters (found by tuning to that data and
queries), the performance gains are insignificant.
In this paper we propose an alternative approach to query
expansion. The general approach we consider is that the
source of expansion terms need not be the collection itself,
but could be any document set whose topic coverage is simi-
lar to that of the collection, and may thus suggest additional
query terms.
Specifically, we investigate whether query associations can
be used for query expansion. Given a log containing a large
number of queries, it is straightforward to build a surrogate
for each document in a collection, consisting of the queries
that were a close match to that document. We have shown
in earlier work (Scholer & Williams 2002) that these query
associations can provide a useful document summary; that
is, the queries that match a document are a fair description
of its content. Here, we investigate whether query associa-
tions can play a role in query expansion.
Ranking with query expansion consists of three phases:
ranking the original query against the collection or a docu-
ment set; extracting additional query terms from the highly
ranked items; then ranking the new query against the collec-
tion. We show that query associations are a highly effective
source of expansion terms. On the TREC-10 data, average
precision rises from 0.158 for optimised full text expansion
to 0.189 for expansion via association, a dramatic relative
improvement of 19.5%.
2. BACKGROUND
This paper explores refinements to automatic query ex-
pansion by considering alternative ways in which candidate
expansion terms can be chosen. In this section, we consider
related background work in the areas of query expansion,
the use of past user queries, and expansion using anchor
text from web documents.
Query Expansion
Relevance feedback was proposed over thirty years ago as
a method for improving the effectiveness of information re-
trieval (Salton & McGill 1983). In this approach, a user
is presented with a list of answers to a query, which the
user would then mark as relevant or irrelevant to the infor-
mation need. It was observed (van Rijsbergen 1979) that
terms closely related to those which successfully discrim-
inate relevant from non-relevant documents are good dis-
criminators themselves. Since it can be assumed that query
terms are useful in favouring relevant documents, expansion
terms that are related to the original query terms should be
useful for ranking. The system could then develop a new
query based on this feedback; experiments showed that the
new query could be evaluated with significantly greater ef-
fectiveness. However, the approach does require that the
user takes the time to assess each document.
More recently, query expansion or QE—also known as
pseudo-relevance feedback or automatic query expansion—
was developed as a variation of relevance feedback (Buckley,
Salton, Allan & Singhal 1994). In this approach, an original
query is run using conventional information retrieval tech-
niques (Arasu, Cho, Garcia-Molina, Paepcke & Raghavan
2001, Baeza-Yates & Ribeiro-Neto 1999, Witten et al. 1999).
Then, related terms are extracted from, for example, the
top 10 documents that are returned in response to the origi-
nal query; the additional terms are selected using statistical
heuristics. The related terms are then added to the original
query, and the expanded query is run again to return a fresh
set of documents, which are returned to the user. Again, ex-
periments showed that the method improves effectiveness.
As an example, a user trying to find information about the
Richmond football team might pose a query such as rich-
mond. After expansion, the query might be richmond club
football afl tigers. The reformulated query does not con-
centrate only on documents that contain the single original
query term, but can also retrieve documents that are on the
same topic as the original query but for some reason do not
name the team.
Variations of QE and relevance feedback involve further
interaction (Leuski 2000), and for example require the user
to rate any additional query terms proposed by the system.
Some of the early web search engines, such as Excite, used
a similar approach. In addition, selected techniques such as
Rocchio expansion—which we discuss later in this section—
are also used in applications such as document categorisa-
tion.
In this work, we have used the expansion method de-
scribed at TREC 8 by Robertson & Walker (2000), which
improves effectiveness on average by about 10%—an impres-
sive improvement, given that the effectiveness of the un-
derlying Okapi BM25 similarity measure is already high.
In this approach, documents are initially ranked using the
Okapi BM25 measure (Robertson & Walker 1999, Robert-
son, Walker, Hancock-Beaulieu, Gull & Lau 1992) applied
to the original query.
The Okapi BM25 measure is as follows:

    bm25(q, d) = \sum_{t \in q} \log \left( \frac{N - f_t + 0.5}{f_t + 0.5} \right) \times \frac{(k_1 + 1) f_{d,t}}{K + f_{d,t}}

where: q is a query, containing terms t; d is a document; N is
the number of documents in the collection; f_t is the number
of documents containing term t; K is k_1((1 - b) + b \times L_d / AL);
k_1 and b are parameters, set to 1.2 and 0.75; f_{d,t} is the
number of occurrences of t in d; and L_d and AL are the
document length and average document length respectively,
measured in some suitable unit.
A detailed explanation of the Okapi BM25 formulation is
presented by Sparck-Jones, Walker & Robertson (2000). We
have omitted some additional parameters that are not used
in this context; for example, we have assumed that query
terms are not repeated. The first term in the BM25 measure
reduces the impact of query terms that are common in the
collection, and the second favours documents that have a
high density of query terms.
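As a concrete illustration, the following sketch computes the
measure as defined above (the function and argument names are
our own, and the collection statistics are assumed to have been
gathered at index time):

    import math

    def bm25(query_terms, doc_term_freqs, doc_len, avg_doc_len,
             num_docs, doc_freq, k1=1.2, b=0.75):
        """Okapi BM25 similarity of a query to one document.

        query_terms    -- the (unrepeated) query terms t
        doc_term_freqs -- dict mapping t to f_{d,t} for this document
        doc_len        -- L_d; avg_doc_len -- AL
        num_docs       -- N; doc_freq -- dict mapping t to f_t
        """
        K = k1 * ((1 - b) + b * doc_len / avg_doc_len)
        score = 0.0
        for t in query_terms:
            f_t = doc_freq.get(t, 0)
            f_dt = doc_term_freqs.get(t, 0)
            if f_t == 0 or f_dt == 0:
                continue  # term absent from collection or document
            idf = math.log((num_docs - f_t + 0.5) / (f_t + 0.5))
            score += idf * ((k1 + 1) * f_dt) / (K + f_dt)
        return score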
The top R ranked documents provide a pool from which
expansion terms are chosen based on their term selection
value:

    TSV_t = \left( \frac{f_t}{N} \right)^{r_t} \binom{R}{r_t}

where r_t is the number of these documents that contain
term t. The E terms that have the lowest selection value and
are not included in the original query are appended to form
a new query.
The reformulated query is then used to rank documents,
but instead of using the expansion terms’ Okapi value as
above, the Robertson & Walker (1999) formulation:
    \frac{1}{3} \times \log \left( \frac{(r_t + 0.5) / (R - r_t + 0.5)}{(f_t - r_t + 0.5) / (N - f_t - R + r_t + 0.5)} \right)
is used. The division by three helps prevent expansion terms
from dominating in the reformulated query. (We tested
varying this factor and confirmed that division by three is
an appropriate choice.)
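A minimal sketch of the selection and reweighting steps under
these definitions (the helper names are ours; candidates is
assumed to hold, for each candidate term, the count r_t gathered
from the top R documents):

    import math

    def select_expansion_terms(candidates, doc_freq, num_docs,
                               R, E, query_terms):
        """Pick the E terms with the lowest term selection value
        that do not already appear in the original query.

        candidates -- dict mapping term t to r_t, the number of
                      the top R documents containing t
        doc_freq   -- dict mapping term t to f_t
        """
        def tsv(t, r_t):
            return ((doc_freq[t] / num_docs) ** r_t
                    * math.comb(R, r_t))

        scored = sorted((tsv(t, r_t), t)
                        for t, r_t in candidates.items()
                        if t not in query_terms)
        return [t for _, t in scored[:E]]

    def expansion_term_weight(r_t, f_t, num_docs, R):
        """Robertson & Walker (1999) weight for an expansion term,
        divided by three so expansion terms do not dominate."""
        numerator = (r_t + 0.5) / (R - r_t + 0.5)
        denominator = ((f_t - r_t + 0.5)
                       / (num_docs - f_t - R + r_t + 0.5))
        return (1.0 / 3.0) * math.log(numerator / denominator)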
In most of the Okapi-related expansion experiments that
have been reported, the key parameters are fixed, typically
with R = 10 and E = 25. Sakai & Robertson (2001) have
outlined an alternative approach, and Carpineto, de Mori,
Romano & Bigi (2001) have investigated the effect of alter-
native parameter settings.
There has been a great deal of work on the topic of auto-
matic relevance feedback and related areas. One of the ear-
liest and most influential papers is that of Rocchio (1971),
who demonstrated the use of training from positive and neg-
ative examples of relevance to improve the query formula-
tion. Approaches based on the work of Rocchio continue to
be investigated (Carpineto et al. 2001).
However, query expansion is not always effective. Biller-
beck & Zobel (2003) and Carpineto et al. (2001) have shown
that query expansion using local analysis with fixed param-
eters is not robust. In some collections, such as the TREC
web data, expansion does not appear to work effectively.
The experiments reported later show poor results, for ex-
ample, for an expansion technique that has repeatedly been
found to work on the TREC newswire data.
Past queries
Past queries have been shown to be useful for increasing re-
trieval effectiveness (Fitzpatrick & Dent 1997, Furnas 1985,
Raghavan & Sever 1995). Fitzpatrick & Dent (1997) inves-
tigated the use of past queries to improve automatic query
expansion, by using the results of past queries to form affin-
ity pools, from which expansion terms are then selected. The
process works as follows: for a query that is to be expanded,
up to three past queries that are highly similar to the cur-
rent query are identified. The top 100 documents that were
returned for each of these past queries are then merged,
forming the affinity pool. Candidate expansion terms are
identified by running the original query against this pool.
Individual terms are then selected from the top-ranked doc-
uments using a TF-IDF term-scoring algorithm. Fitzpatrick
and Dent demonstrate that this technique improves relative
average precision for the TREC-5 collection by around 15%,
from 21.3% to 24.5%. In our work, we propose a different
approach to the use of past queries: we choose expansion
terms from past queries directly, rather than using them to
construct sets of full text documents from which terms are
then selected.
Query association (Scholer & Williams 2002) is a tech-
nique whereby user queries become associated with a doc-
ument if they share a high statistical similarity with the
document. The association process proceeds as follows: a
query is submitted to a search system, and a similarity score
is calculated (for example, using the Okapi ranking formula
described in Section 2). The query then becomes associated
with the top N documents that are returned. For efficiency,
an upper bound, M, is imposed on the number of queries
that can become associated with a single document. Once a
document has a full set of M associations, the least similar
associated query can be dynamically replaced with a new,
more similar query.
Consider a brief example, where we start with no stored
associations, and association parameter settings of M = 2
and N = 5. A user runs an initial query (q1), “stars on crys-
talline sphere”. This query becomes associated with the top
5 answer documents returned by the search system. Suppose
that a second query (q2), “nicolaus copernicus”, retrieves a
further 5 documents, one of which was also retrieved by the
first query. Then this document now has two associations,
while eight other documents in the collection have one. Now
consider a final query (q3), “geocentric cosmology”, which
as one of its answers also retrieves the document that al-
ready has two associations. If the similarity scores between
queries and the common document are ordered such that
q1 < q2 < q3, then q1 will be replaced with the query q3 as
an association for that document.
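A minimal sketch of this replacement logic (our own
illustration, keeping each document's associations in a
min-heap keyed on similarity so that the least similar query
is evicted first):

    import heapq

    def associate(query, ranked_docs, associations, M):
        """Associate a query with its top-N answer documents.

        ranked_docs  -- the top N (doc_id, similarity) pairs
                        returned for the query
        associations -- dict mapping doc_id to a min-heap of
                        (similarity, query) pairs, capped at M
        """
        for doc_id, sim in ranked_docs:
            heap = associations.setdefault(doc_id, [])
            if len(heap) < M:
                heapq.heappush(heap, (sim, query))
            elif sim > heap[0][0]:
                # Evict the least similar associated query.
                heapq.heapreplace(heap, (sim, query))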
Query association was proposed for the creation of doc-
ument summaries, to aid users in judging the relevance of
answers returned by a search system. Scholer & Williams
(2002) report that appropriate parameter settings for this
task are M = 5 and N = 3, leading to small summaries com-
posed of high-quality associations. Keeping the summaries
small was important for reducing cognitive processing costs
for the user. In this work, we use associated queries as a
source of terms for query expansion. It is therefore not im-
perative that the number of associated queries be kept low.
We discuss the choice of parameters further in Section 3.
Anchor text
Craswell et al. examined the effectiveness of document sur-
rogates created from anchor text for finding entry pages to
web sites (Craswell, Hawking & Robertson 2001). In their
work, the text content of hypertext links or anchor tags that
inlink to a document is extracted and compiled into a doc-
ument surrogate. Experiments show that document surro-
gates derived from anchor text are significantly more effec-
tive than full-text retrieval for a page-finding task (Hawking
& Craswell 2001). However, anchor text was not found to
be useful for topic-finding tasks. Therefore, for example,
anchor text could be expected to aid retrieval for queries
such as “richmond football club” but not for queries such as
“kicking footballs”. We examine the use of anchor text as a
source for query expansion terms in Section 3.
3. GENERALISED EXPANSION
In this section, we describe our approach to query expan-
sion and, in particular, focus on the novel use of query asso-
ciations in the expansion process. Our generalised method
for query expansion proceeds as follows: first, a query is
submitted to our search system, and the top R answer doc-
uments are obtained, based on a particular collection. From
this initial retrieval run, it is possible to identify a set of can-
didate expansion terms; these may be based on the top R
documents, or surrogates corresponding to these documents.
Then the top E expansion terms are selected, using Robert-
son and Walker’s term selection value formula (see Sec-
tion 2). Finally, selected terms are appended to the original
query, which is then run against the target text collection.
Within this general framework, if we use a single collection
of documents for all steps, then query expansion is of the
standard form, as for example proposed by Robertson &
Walker (2000). We call this scheme full-full, as steps one
and two of the expansion process are based on the full text
of documents in the collection.
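The framework can be viewed as a pipeline parameterised by the
text used at each step. The sketch below is our own
illustration, with hypothetical rank and select_terms functions
standing in for the ranking and term-selection machinery
described in Section 2; passing the full-text collection as
both sources yields the full-full scheme:

    def expand_and_rank(query_terms, rank_source, select_source,
                        collection, R, E, rank, select_terms):
        """Generalised query expansion.

        rank_source   -- documents or surrogates ranked in step one
        select_source -- dict of texts, keyed by the ids returned
                         in step one, from which terms are drawn
        rank          -- function (terms, docs) -> ranked doc ids
        select_terms  -- function (terms, pool, R, E) -> term list
        """
        top_ids = rank(query_terms, rank_source)[:R]
        pool = [select_source[doc_id] for doc_id in top_ids]
        expanded = query_terms + select_terms(query_terms, pool, R, E)
        # The reformulated query is always run against the target
        # collection, whatever sources fed the first two steps.
        return rank(expanded, collection)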
Instead of initially searching or choosing expansion terms
from the full text of a document, another possibility is to use
surrogate documents constructed from query associations.
In this approach, queries are associated with documents as
described earlier, then the set of associated queries for a
document is used to represent the document. These can be
incorporated into the expansion framework in either the first
step (ranking), the second step (term selection), or both. In
detail, these three options are as follows.
1. The original query can be run on the full text collec-
tion, after which the top E expansion terms are se-
lected from the set of queries that have previously be-
come associated with the top R documents returned
from running the original query.
We call this scheme full-assoc, as step one of ex-
pansion is based on the full text of documents in the
collection, while step two is based on query associations.
2. Initially rank directly on the surrogates built from as-
sociations, then choose expansion terms from the orig-
inal documents. We call this scheme assoc-full.
3. Rank on the document surrogates built from associ-
ations, then select the E expansion terms from the
top R ranked surrogates. We call this scheme assoc-
assoc, as associations are used for both steps 1 and 2
of expansion.
The use of associations for expansion is attractive for sev-
eral reasons. One is that it means that the additional terms
have already been chosen by users as descriptors of top-
ics. Another is that, in contrast to expanding directly from
queries, there is more evidence of relevance: a surrogate doc-
ument constructed from associations has many more terms
than an individual query. The fact that the queries have
associated with the document means that, in some sense,
the terms have topical relationships with each other.
Unlike other methods that make use of the top-ranked
documents, such as expansion from document summarisa-
tions by Lam-Adesina & Jones (2001), assoc-assoc does
not rely directly on the document collection that is searched.
The second and third variations of using associations (sche-
mes assoc-full and assoc-assoc) do not rely on rank-
ing the documents in the collection to find relevant associ-
ations, but treat the associations themselves as documents
that are ranked and used as sources for expansion terms.
Thus, in contrast to using thesauruses (Mandala, Tokunaga
& Tanaka 1999), the “aboutness” of the individual docu-
ments is captured and made use of.
An alternative way to find candidate terms for query ex-
pansion from past user queries is to treat the individual
queries as documents. We can then source expansion terms
by initially ranking the individual queries, and selecting E
terms from the top R past queries returned. We call this
scheme query-query. Note that, as the individual queries
have no direct relation with any particular full text docu-
ment in the collection (in contrast to the association case
above), it does not make sense to have a full-query or
query-full scheme.
Another source of terms for query expansion that we have
experimented with is anchor text. Inlinks (text from anchor
tags in other documents that point to a document) have
a direct relationship with documents in the collection. We
consider one approach using anchor text, where we select the
E expansion terms from the top R anchor text surrogates,
and then search the surrogates again using the expanded
query. We call this link-link.
Most of the schemes described above have parameters that
need to be determined, in particular R and E. Rather than
make arbitrary choices of values, in most cases we used
the TREC-9 queries and relevance judgements described
below to find good parameter settings, using the average
precision measure, and then report only these settings on
TREC-10. Thus the TREC-9 results are the best possible
for that method on that data, with post hoc tuning, while
the TREC-10 results are blind runs.
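A sketch of this tuning step (the evaluate function and the
parameter ranges are assumptions for illustration; evaluate is
taken to run a given scheme with parameters R and E on the
TREC-9 queries and return mean average precision):

    def tune_parameters(evaluate, R_values, E_values):
        """Exhaustively search for the (R, E) pair that maximises
        average precision on the training (TREC-9) queries."""
        best_avp, best_R, best_E = max(
            (evaluate(R, E), R, E)
            for R in R_values for E in E_values)
        return best_R, best_E, best_avp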
4. EXPERIMENTS
In this section, we describe our experimental environment,
discuss the statistical significance tests used to validate our
results, and present the results of our experiments with
query expansion techniques for web collections.
Setup
For our experiments, we used the experimental testbed made
available by the TREC conferences (Harman 1995) for the
evaluation of information retrieval experiments. Our experi-
ments were conducted using the TREC WT10g collection, a
10 gigabyte collection of data crawled from the World Wide
Web in 1997 (Bailey, Craswell & Hawking 2001). The col-
lection was constructed to be representative of the web, and
consists of 1.69 million documents with a high level of in-
terconnectivity, allowing link-based retrieval methods to be
evaluated.
Fifty test queries and corresponding relevance judgements
for this collection were developed as part of each of the Web
tracks at the TREC-9 and TREC-10 conferences (Voorhees
& Harman 2000, Voorhees & Harman 2001). TREC queries
consist of four parts: a query number; a title field (for the
TREC-9 web track, these were taken from search engine
logs (Hawking 2000)); a description of the user’s information
need; and a narrative, giving more detail on what kinds
of documents should be considered relevant or irrelevant.
For our experiments, we only used the title field for the
initial query, as this is most representative of a typical web
information retrieval task.
We use a variant of the Okapi BM25 ranking measure to
obtain our similarity scores (see Section 2 and Robertson &
Walker (1999)). The full text retrieval results that we use
as a baseline are from runs that use no query expansion.
The approach used for creating associations is that de-
scribed in Section 2. The query associations were built us-
ing two logs from the Excite search engine
(http://www.excite.com/), each taken from
a single day in 1997 and 1999 (Spink, Jansen, Wolfram &
Saracevic (2002) provide a comprehensive analysis of the
properties of these query logs). After filtering the logs to
eliminate profanities, and removing duplicates and punctu-
ation, we were left with 917,455 queries to associate with
the collection. The average number of associations per doc-
ument after processing was 5.4, and just under 25% of doc-
uments in the collection had zero associations. While we
built our associations as a batch job, in a production sys-
tem associations would be made in real time as each query
is submitted to the system; it is unclear what the costs of
this process are, and we plan to investigate this in future
work.
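As an illustration of the kind of preparation involved, a
sketch of the log cleaning (the profanity list and the exact
rules are assumptions, not the procedure actually used):

    import string

    def clean_log(raw_queries, profanity):
        """Drop queries containing profanities, strip punctuation,
        and remove duplicates, preserving first occurrence."""
        strip_punct = str.maketrans("", "", string.punctuation)
        seen, cleaned = set(), []
        for q in raw_queries:
            q = q.lower().translate(strip_punct).strip()
            if not q or any(w in profanity for w in q.split()):
                continue
            if q not in seen:
                seen.add(q)
                cleaned.append(q)
        return cleaned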
In separate experiments, we established effective settings
for query association (Scholer, Williams & Turpin 2003). We
found that association parameters of M = 19 and N = 39
worked well, that is, each document has a maximum of 19
associated queries and the top 39 documents returned in
response to a query are associated with that query.
Type AvP P@10 P@20 P@30 R-P R E
Base 0.1487 0.2714 0.2235 0.2000 0.1710
assoc-assoc 0.1893* 0.3429* 0.2888* 0.2503* 0.2204* 06 17
assoc-full(a) 0.1820* 0.3184** 0.2796* 0.2497* 0.2222* 06 17
assoc-full(b) 0.1618* 0.3041** 0.2510* 0.2231* 0.1969* 98 04
full-full(a) 0.1584 0.2796 0.2571* 0.2333* 0.1809 10 25
query-query 0.1567 0.2755 0.2357* 0.2116** 0.1861* 65 02
full-full(b) 0.1553 0.2857 0.2388 0.2184** 0.1867** 98 04
full-assoc 0.1549 0.2571 0.2276 0.2068 0.1786 06 17
link-link 0.1454** 0.2653 0.2153 0.1905 0.1685 37 02
Table 1: Performance of expansion techniques for TREC-10 queries on the TREC WT10g collection, based
on average precision (AvP), precision at 10 (P@10), precision at 20 (P@20), precision at 30 (P@30), and
R-Precision (R-P). Schemes are ordered by decreasing AvP. Results that show a significant difference from
the baseline using the Wilcoxon signed rank test at the 0.05 and 0.10 levels are indicated by * and **,
respectively.
Type Q1 Median Q3 Variance
assoc-assoc -0.0239 0.0040 0.0814 0.7327
assoc-full(a) -0.0073 0.0116 0.0738 0.4286
assoc-full(b) -0.0119 0.0030 0.0308 0.1201
full-full(a) -0.0265 0.0007 0.0302 0.2633
query-query -0.0150 -0.0000 0.0082 0.1108
full-full(b) -0.0211 -0.0004 0.0177 0.1319
full-assoc -0.0370 -0.0007 0.0306 0.2077
link-link -0.0049 -0.0001 0.0002 0.0410
Table 2: Quartiles and variance of different expansion methods on the TREC WT10g collection (TREC-10).
Each number is the effectiveness relative to the baseline of no expansion.
This was
determined by creating document surrogates from the as-
sociated queries based on different parameter combinations,
and testing retrieval effectiveness by evaluating searches on
these surrogates. We use these parameter settings for our
results reported below.
For our experiments where expansion terms are chosen
directly from queries (with no association), we also use the
917,455 filtered entries from the 1997 and 1999 Excite
logs.
The anchor text for our experiments was obtained by iden-
tifying anchor tags within the WT10g collection, and collat-
ing the text of each anchor that points in to a particular
document into a surrogate for that document.
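A sketch of this collation (a simplified illustration: the
regular expression is an assumption, and real anchor
extraction must also resolve relative links to document
identifiers):

    import re
    from collections import defaultdict

    ANCHOR = re.compile(r'<a\s[^>]*href="([^"]*)"[^>]*>(.*?)</a>',
                        re.IGNORECASE | re.DOTALL)

    def build_anchor_surrogates(pages):
        """Collate the text of each anchor under the document it
        points to. pages maps a source URL to its HTML."""
        surrogates = defaultdict(list)
        for source_url, html in pages.items():
            for target, text in ANCHOR.findall(html):
                text = re.sub(r"<[^>]+>", " ", text)  # nested tags
                surrogates[target].append(text.strip())
        return {url: " ".join(t) for url, t in surrogates.items()}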
Significance
We evaluate the significance of our results using the Wilcox-
on signed rank test. This is a non-parametric procedure used
to test whether there is sufficient evidence that the medians
of two probability distributions differ in location. For infor-
mation retrieval experiments, it can be used to test whether
two retrieval runs based, for example, on different query ex-
pansion techniques, differ significantly in performance. As
it takes into account the magnitude and direction of the dif-
ference between paired samples, this test is more powerful
than the sign test (Daniel 1990). Being a non-parametric
test, it is not necessary to make any assumptions about the
underlying probability distribution of the sampled popula-
tion.
The t-test, an alternative, parametric test for paired dif-
ference analysis, assumes that the data is normally distri-
buted. Zobel (1998) analysed the results of retrieval ex-
periments from TREC-5, and concluded that the Wilcoxon
signed rank test is a more reliable test for the evaluation of
retrieval runs.
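For illustration, the test is applied to paired per-query
scores; a sketch using the scipy implementation (not the
software used in these experiments):

    from scipy.stats import wilcoxon

    def runs_differ(scores_a, scores_b, alpha=0.05):
        """Paired Wilcoxon signed rank test over per-query average
        precision for two retrieval runs of the same topics."""
        statistic, p_value = wilcoxon(scores_a, scores_b)
        return p_value < alpha, p_value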
It is a mistake to claim a significant change in performance
based only on different effectiveness scores (Zobel 1998).
While post-hoc analysis of error rates can give valuable in-
formation about the properties of a collection—see for exam-
ple Voorhees & Buckley (2002), who calculate error rates for
the TREC-3 to TREC-10 data empirically—such thresholds
cannot safely be extended to future runs; as these runs were
not themselves part of the calculation, per-query variability
would not be taken into account. A statistical significance
test, on the other hand, enables conclusions to be drawn
about whether a variation in retrieval technique leads to
consistent performance gains.
Results
In our experiments, we have compared a baseline full text
retrieval run with the expansion variants we have described
in Section 3. Our results are presented in Table 1, which
shows retrieval performance based on five precision metrics:
precision at 10 returned documents (P@10), precision at 20
(P@20), precision at 30 (P@30), average interpolated preci-
sion (AvP), and R-precision (R-P).
In these results, the full-full scheme is the conventional
approach to query expansion. We show results with two
parameter settings: first full-full(a), the parameter set-
tings used by Robertson & Walker (2000) of R = 10 and
E = 25; and, second, full-full(b), the optimal parameters
we found in exhaustive tests (we discuss this further below).
Type AvP P@10 P@20 P@30 R-P R E
Base 0.1895 0.2708 0.2042 0.1806 0.2290
assoc-assoc 0.2231* 0.3104** 0.2323** 0.2132* 0.2398 06 17
query-query 0.1996 0.2958** 0.2094 0.1875 0.2402 65 02
assoc-full(a) 0.1966* 0.2604 0.2115 0.1944 0.2167 06 17
link-link 0.1939 0.2708 0.2177 0.1799 0.2201 37 02
full-full(b) 0.1923 0.2813 0.2042 0.1819 0.2287 98 04
assoc-full(b) 0.1856* 0.2875 0.2229** 0.1924 0.2176 98 04
full-assoc 0.1804 0.2417** 0.2052 0.1910 0.1954* 06 17
full-full(a) 0.1607 0.2729** 0.2083 0.1854 0.1806** 10 25
Table 3: Performance of expansion techniques on the training data (TREC-9). We used these queries to
determine parameter settings and included them for reference. Schemes are again ordered by decreasing
AvP.
Type Q1 Median Q3 Variance
assoc-assoc -0.0057 0.0040 0.0604 0.5814
query-query -0.0022 0.0000 0.0049 0.0688
assoc-full(a) -0.0044 0.0006 0.0458 0.8446
link-link -0.0067 -0.0001 0.0019 0.0938
full-full(b) -0.0093 0.0000 0.0075 0.1272
assoc-full(b) -0.0028 0.0020 0.0295 0.4789
full-assoc -0.0259 -0.0040 0.0197 0.3155
full-full(a) -0.0423 -0.0072 0.0086 0.8487
Table 4: Quartiles and variance of different expansion methods on the training data (TREC-9).
Perhaps surprisingly—but in agreement with recent observa-
tions (Billerbeck & Zobel 2003, Carpineto et al. 2001)—the
full-full schemes do not offer significantly better results
than no expansion, except in the R-P measure for our opti-
mal parameter settings (where the relative improvement is
9%, corresponding to an absolute increase of 0.016).
Our novel association-based schemes are effective for query
expansion. In relative terms, the assoc-assoc scheme is
18%–20% better in all three measures than full-full(a)
expansion (an absolute difference of 0.03–0.05), and 26%–
29% better than the baseline no-expansion case (an abso-
lute difference of 0.04–0.07). The assoc-full schemes are
even more effective in the stringent R-Precision measure; all
results are significant at least at the 0.10 level. We con-
clude that query association is an effective tool in the initial
querying stage prior to expansion; this is a particularly use-
ful result since query associations are compact and can be
efficiently searched, and we plan to quantify this in future
work.
Another perspective on the results is shown in Table 2.
All of the methods improve median performance to approx-
imately the same extent—that is, not at all. At the lower
quartile, all the methods have degraded performance some-
what; the extent of degradation has little relationship to av-
erage effectiveness. At the upper quartile, however, the dif-
ferences between the methods are clear. However, note that
it is often the case that a query improved by one method is
not improved by another.
An example of the behaviour of the assoc-assoc scheme
illustrates its utility over the full-full approach. For the
query “earthquakes” (TREC query 513), the average pre-
cision for the assoc-assoc scheme is 0.1706, compared to
0.1341 for no expansion and 0.1162 for the full-full ap-
proach. This is a direct result of the choice of terms for ex-
pansion. For the assoc-assoc scheme, the expanded query
is large and appears to contain only useful terms:
earthquakes earthquake recent nevada
seismograph tectonic faults perpetual 1812 kobe
magnitude california volcanic activity plates
past motion seismological
In contrast, for the full-full(b) scheme the query is
narrower:
earthquakes tectonics earthquake geology
geological
These trends in the success of expansion with assoc-assoc
are consistent with our empirical inspections of other queries.
The query-query, full-assoc, and link-link schemes
offer limited benefit for expansion. Without association to
documents, the queries are ineffective: this is probably due
to the median length of the queries being two words, that
is, the queries have very little content when not grouped
together as associations. The full-assoc scheme is ineffec-
tive for similar reasons: query associations are an excellent
source of expansion terms, but are less effective as document
surrogates in the final step of the retrieval process. The
link-link scheme is significantly worse than no expansion
for the AvP measure; this is perhaps unsurprising, since an-
chor text has been shown to be of utility in home or named
page finding tasks, while the queries we use are topic finding
tasks.
As discussed earlier, we tuned our parameters prior to
our experiments. Specifically, the R and E parameters used
in our experiments were identified through an exhaustive
search on the same collection, but using TREC-9 queries.
The results of this process with the TREC-9 queries are
shown for reference in Tables 3 and 4.
We have tried similar experiments on TREC disks 4 and
5, which consist of newswire and similar data. These were
unsuccessful. The problem appears to be that the query
logs, drawn from the web, are inappropriate for this data:
based on a small sample, it seems that many of the queries
in the log do not have relevant documents. Thus the process
of creating newswire associations from search-engine logs is
unlikely to be successful. Query association based expansion
is therefore only of utility if queries are available that are
appropriate for the collection being searched.
5. CONCLUSIONS
Conventional wisdom has held that query expansion is
an effective technique for information retrieval. However,
recent experiments have contradicted this and shown that
parameter settings that work well for one set of queries may
be ineffective on another. In this paper, we have investigated
alternative techniques for obtaining query expansion terms,
with the aim of identifying techniques that are robust for
different query sets.
We have identified a successful expansion source for web
retrieval. This source is query associations, that is, past
queries that have been stored as document surrogates for
the documents that are statistically similar to the query.
In experiments with almost one million prior query associ-
ations, we found that expanding TREC-10 web track topic
finding queries using query associations and then searching
the full text is 26%–29% more effective than no expansion,
and 18%–20% better than an optimised conventional expan-
sion approach. Moreover, our results are significant under
statistical tests. We conclude that query associations are a
powerful new expansion technique for web retrieval.
We plan to pursue several directions in our future work.
We will investigate the optimal parameters for query asso-
ciation in the context of query expansion; this work uses
parameters that were determined through a surrogate re-
trieval task. We also plan to investigate whether fixed pa-
rameters for query association are appropriate, and whether
all queries should be associated to documents. In addition,
we plan to investigate the efficiency tradeoff between main-
taining associations and the likely efficiency improvement
of searching associations for expansion terms instead of full
text.
Acknowledgements
This research is supported by the Australian Research Coun-
cil and by the State Government of Victoria.
6. REFERENCES
Arasu, A., Cho, J., Garcia-Molina, H., Paepcke, A. & Raghavan,
S. (2001), ‘Searching the web’, ACM Transactions on
Internet Technology (TOIT) 1(1), 2–43.
Baeza-Yates, R. & Ribeiro-Neto, B. (1999), Modern Information
Retrieval, Addison-Wesley Longman.
Bailey, P., Craswell, N. & Hawking, D. (2001), ‘Engineering a
multi-purpose test collection for web retrieval experiments’,
Information Processing and Management . In revision.
Available from www.ted.cmis.csiro.au/dave/cwc.ps.gz.
Billerbeck, B. & Zobel, J. (2003), When query expansion fails,
in C. Clarke, G. Cormack, J. Callan, D. Hawking &
A. Smeaton, eds, ‘Proceedings of the ACM-SIGIR
International Conference on Research and Development in
Information Retrieval’, Toronto, Canada, pp. 387–388.
Buckley, C., Salton, G., Allan, J. & Singhal, A. (1994),
Automatic query expansion using SMART: TREC 3, in
D. Harman, ed., ‘Overview of the Third Text REtrieval
Conference (TREC-3)’, NIST Special Publication 500-225,
pp. 69–80.
Carpineto, C., de Mori, R., Romano, G. & Bigi, B. (2001), ‘An
information-theoretic approach to automatic query
expansion’, ACM Transactions on Information Systems
(TOIS) 19(1), 1–27.
Craswell, N., Hawking, D. & Robertson, S. (2001), Effective site
finding using link anchor information, in D. H. Kraft,
W. B. Croft, D. J. Harper & J. Zobel, eds, ‘Proceedings of
the ACM-SIGIR International Conference on Research and
Development in Information Retrieval’, New Orleans, LA,
pp. 250–257.
Daniel, W. (1990), Applied Nonparametric Statistics, 2nd edn,
PWS-KENT Publishing Company.
Fitzpatrick, L. & Dent, M. (1997), Automatic feedback using
past queries: Social searching?, in N. J. Belkin, A. D.
Narasimhalu, P. Willett, W. Hersh, F. Can & E. Voorhees,
eds, ‘Proceedings of the ACM-SIGIR International
Conference on Research and Development in Information
Retrieval’, Philadelphia, PA, pp. 306–313.
Furnas, G. W. (1985), Experience with an adaptive indexing
scheme, in L. Borman & R. Smith, eds, ‘Proceedings of the
ACM-CHI Conference on Human Factors in Computing
Systems’, pp. 131–135.
Harman, D. (1995), ‘Overview of the second text retrieval
conference (TREC-2)’, Information Processing &
Management 31(3), 271–289.
Hawking, D. (2000), Overview of the TREC-9 web track, in
‘The Ninth Text REtrieval Conference (TREC 9)’,
National Institute of Standards and Technology Special
Publication 500-249, Washington, DC, pp. 87–99.
Hawking, D. & Craswell, N. (2001), Overview of the TREC-2001
web track, in E. M. Voorhees & D. K. Harman, eds, ‘The
Tenth Text REtrieval Conference (TREC 2001)’, National
Institute of Standards and Technology Special Publication
500-250, Washington, DC, pp. 61–67.
Lam-Adesina, A. M. & Jones, G. J. F. (2001), Applying
summarization techniques for term selection in relevance
feedback, in D. H. Kraft, W. B. Croft, D. J. Harper &
J. Zobel, eds, ‘Proceedings of the ACM-SIGIR
International Conference on Research and Development in
Information Retrieval’, New Orleans, LA, pp. 1–9.
Leuski, A. (2000), Relevance and reinforcement in interactive
browsing, in A. Agah, J. Callan, E. Rundensteiner &
S. Gauch, eds, ‘Proceedings of the ACM-CIKM
International Conference on Information and Knowledge
Management’, McLean, VA, pp. 119–126.
Mandala, R., Tokunaga, T. & Tanaka, H. (1999), Combining
multiple evidence from different types of thesaurus for
query expansion, in F. Gey, M. Hearst & R. Tong, eds,
‘Proceedings of the ACM-SIGIR International Conference
on Research and Development in Information Retrieval’,
Berkeley, CA.
Raghavan, V. V. & Sever, H. (1995), On the reuse of past
optimal queries, in E. A. Fox, P. Ingwersen & R. Fidel, eds,
‘Proceedings of the ACM-SIGIR International Conference
on Research and Development in Information Retrieval’,
Seattle, WA, pp. 344–350.
Robertson, S. E. & Walker, S. (1999), Okapi/Keenbow at
TREC-8, in E. M. Voorhees & D. K. Harman, eds, ‘The
Eighth Text REtrieval Conference (TREC-8)’, NIST
Special Publication 500-264, Gaithersburg, MD,
pp. 151–161.
Robertson, S. E. & Walker, S. (2000), Microsoft Cambridge at
TREC-9: Filtering track, in E. M. Voorhees & D. K. Harman,
eds, ‘The Ninth Text REtrieval Conference (TREC-9)’,
NIST Special Publication 500-249, Gaithersburg, MD,
pp. 361–368.
Robertson, S. E., Walker, S., Hancock-Beaulieu, M., Gull, A. &
Lau, M. (1992), Okapi at TREC, in D. K. Harman, ed.,
‘The First Text REtrieval Conference (TREC-1)’, NIST
Special Publication 500-207, Gaithersburg, MD, pp. 21–30.
Rocchio, J. J. (1971), Relevance feedback in information
retrieval, in E. Ide & G. Salton, eds, ‘The Smart Retrieval
System — Experiments in Automatic Document
Processing’, Prentice-Hall, Englewood, Cliffs, New Jersey,
pp. 313–323.
Sakai, T. & Robertson, S. E. (2001), Flexible pseudo-relevance
feedback using optimization tables, in D. H. Kraft, W. B.
Croft, D. J. Harper & J. Zobel, eds, ‘Proceedings of the
ACM-SIGIR International Conference on Research and
Development in Information Retrieval’, New Orleans, LA,
pp. 396–397.
Salton, G. & McGill, M. (1983), Introduction to Modern
Information Retrieval, McGraw-Hill, New York.
Scholer, F. & Williams, H. E. (2002), Query association for
effective retrieval, in C. Nicholas, D. Grossman,
K. Kalpakis, S. Qureshi, H. van Dissel & L. Seligman, eds,
‘Proceedings of the ACM-CIKM International Conference
on Information and Knowledge Management’, McLean, VA,
pp. 324–331.
Scholer, F., Williams, H. & Turpin, A. (2003), Document
surrogates for web search. (Manuscript in submission).
Sparck-Jones, K., Walker, S. & Robertson, S. E. (2000), ‘A
probabilistic model of information retrieval: development
and comparative experiments. Parts 1&2’, Information
Processing and Management 36(6), 779–840.
Spink, A., Jansen, M. B. J., Wolfram, D. & Saracevic, T.
(2002), ‘From e-sex to e-commerce: Web search changes’,
IEEE Computer 35(3), 107–109.
Spink, A., Wolfram, D., Jansen, M. B. J. & Saracevic, T.
(2001), ‘Searching the web: the public and their queries’,
Journal of the American Society for Information Science
and Technology 52(3), 226–234.
van Rijsbergen, C. (1979), Information Retrieval, second edn,
Butterworths.
Voorhees, E. M. & Buckley, C. (2002), The effect of topic set
size on retrieval experiment error, in K. Järvelin,
M. Beaulieu, R. Baeza-Yates & S. H. Myaeng, eds,
‘Proceedings of the ACM-SIGIR International Conference
on Research and Development in Information Retrieval’,
Tampere, Finland, pp. 316–323.
Voorhees, E. M. & Harman, D. K. (2000), Overview of the Ninth
Text REtrieval Conference (TREC-9), in E. M. Voorhees &
D. K. Harman, eds, ‘The Ninth Text REtrieval Conference
(TREC 9)’, National Institute of Standards and Technology
Special Publication 500-249, Gaithersburg, MD, pp. 1–14.
Voorhees, E. M. & Harman, D. K. (2001), Overview of TREC
2001, in E. M. Voorhees & D. K. Harman, eds, ‘The Tenth
Text REtrieval Conference (TREC 2001)’, National
Institute of Standards and Technology Special Publication
500-250, Gaithersburg, MD, pp. 1–15.
Witten, I. H., Moffat, A. & Bell, T. C. (1999), Managing
Gigabytes: Compressing and Indexing Documents and
Images, 2nd edn, Morgan Kaufmann Publishers, San
Francisco.
Zobel, J. (1998), How reliable are the results of large-scale
information retrieval experiments?, in W. B. Croft,
A. Moffat, C. J. van Rijsbergen, R. Wilkinson & J. Zobel,
eds, ‘Proceedings of the ACM-SIGIR International
Conference on Research and Development in Information
Retrieval’, Melbourne, Australia, pp. 307–314.