On the Weaknesses of Correlation Measures Used for Search Engines' Results (Unsupervised Comparison of Search Engine Rankings)

Source: arXiv


The correlation of the result lists provided by search engines is fundamental
and has deep, multidisciplinary ramifications. Here, we present automatic and
unsupervised methods to assess whether or not search engines provide results
that are comparable or correlated. We make two main contributions. First, we
provide evidence that for more than 80% of the input queries - independently
of their frequency - the two major search engines share only three or fewer
URLs in their search results, and that this divergence is increasing. In this
scenario, we show that even the most robust measures based on comparing lists
are useless: with so few common items, their contribution supports no
statistically confident conclusion. Second, to overcome this problem, we
propose the first content-based measures, i.e., direct comparisons of the
contents of the search results; these measures are based on the Jaccard ratio
and on distribution-similarity (CDF) measures. We show that the two are
orthogonal to each other (i.e., Jaccard and distribution) and extend the
discriminative power of list-based measures. Our approach stems from the real
need to compare search-engine results; it is automatic from query selection to
final evaluation, and it applies to any geographical market. It is thus
designed to scale and to serve as a first, necessary filtering step in query
selection for supervised methods.
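The contrast between the two regimes described above can be made concrete with a small sketch. The function and variable names below are our own illustrations, not taken from the paper: `url_overlap` is the list-based signal (count of shared URLs), and `content_jaccard` is a minimal content-based signal, the Jaccard ratio over token sets of the downloaded result contents.

```python
def url_overlap(list_a, list_b):
    """List-based signal: number of URLs shared by two result lists."""
    return len(set(list_a) & set(list_b))

def content_jaccard(text_a, text_b):
    """Content-based signal: Jaccard ratio over the token sets of the
    contents fetched from the search results (illustrative tokenizer)."""
    tokens_a = set(text_a.lower().split())
    tokens_b = set(text_b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)

# Hypothetical example: disjoint URL lists can still point to similar content.
engine1 = ["http://a.example/1", "http://b.example/2"]
engine2 = ["http://c.example/3", "http://d.example/4"]
page1 = "python tutorial for beginners with examples"
page2 = "a beginners tutorial with python examples"

print(url_overlap(engine1, engine2))              # 0 shared URLs
print(round(content_jaccard(page1, page2), 2))    # 0.71
```

The example shows why the paper's content-based measures recover discriminative power where list intersection fails: zero shared URLs, yet a high content similarity.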

  • ABSTRACT: In this paper we present a number of measures that compare rankings of search engine results. We apply these measures to five queries that were monitored daily for two periods of 14 or 21 days each. Rankings of the different search engines (Google, Yahoo! and Teoma for text searches and Google, Yahoo! and Picsearch for image searches) are compared on a daily basis, in addition to longitudinal comparisons of the same engine for the same query over time. The results and rankings of the two periods are compared as well.
    Computer Networks 07/2006; 50(10):1448-1463. DOI:10.1016/j.comnet.2005.10.020
  • ABSTRACT: Search engines are among the most useful and popular services on the Web. Users are eager to know how they compare. Which one has the largest coverage? Have they indexed the same portion of the Web? How many pages are out there? Although these questions have been debated in the popular and technical press, no objective evaluation methodology has been proposed and few clear answers have emerged. In this paper we describe a standardized, statistical way of measuring search engine coverage and overlap through random queries. Our technique does not require privileged access to any database. It can be implemented by third-party evaluators using only public query interfaces. We present results from our experiments showing size and overlap estimates for HotBot, AltaVista, Excite, and Infoseek as percentages of their total joint coverage in mid 1997 and in November 1997. Our method does not provide absolute values. However, using data from other sources, we estimate that as of November 1997 the numbers of pages indexed by HotBot, AltaVista, Excite, and Infoseek were respectively roughly 77M, 100M, 32M, and 17M, and the joint total coverage was 160 million pages. We further conjecture that the size of the static, public Web as of November was over 200 million pages. The most startling finding is that the overlap is very small: less than 1.4% of the total coverage, or about 2.2 million pages, were indexed by all four engines.
    Computer Networks and ISDN Systems 04/1998; 30(1-7):379-388. DOI:10.1016/S0169-7552(98)00127-5
  • ABSTRACT: Motivated by several applications, we introduce various distance measures between "top k lists." Some of these distance measures are metrics, while others are not. For each of these latter distance measures, we show that they are "almost" a metric in the following two seemingly unrelated aspects: (i) they satisfy a relaxed version of the polygonal (hence, triangle) inequality, and (ii) there is a metric whose positive constant multiples bound our measure above and below. This is not a coincidence: we show that these two notions of almost being a metric are the same. Based on the second notion, we define two distance measures to be equivalent if they are bounded above and below by constant multiples of each other. We thereby identify a large and robust equivalence class of distance measures. Besides the applications to the task of identifying good notions of (dis-)similarity between two top k lists, our results imply polynomial-time constant-factor approximation algorithms for the rank aggregation problem with respect to a large class of distance measures.
    SIAM Journal on Discrete Mathematics 10/2002; 17(1). DOI:10.1137/S0895480102412856
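One member of the family of top-k distance measures discussed above is easy to sketch: Spearman's footrule adapted to top-k lists, where an item missing from a list is assigned the virtual position k+1 (the "location parameter" approach of Fagin, Kumar, and Sivakumar). The function name below is our own, and this is a simplified illustration, not the paper's full framework.

```python
def topk_footrule(list_a, list_b, k=None):
    """Footrule distance between two top-k lists: sum over all items of the
    absolute difference of their ranks, with items absent from a list
    placed at the virtual position k + 1."""
    if k is None:
        k = max(len(list_a), len(list_b))
    pos_a = {item: rank for rank, item in enumerate(list_a, start=1)}
    pos_b = {item: rank for rank, item in enumerate(list_b, start=1)}
    universe = set(pos_a) | set(pos_b)
    return sum(abs(pos_a.get(u, k + 1) - pos_b.get(u, k + 1)) for u in universe)

print(topk_footrule(["x", "y", "z"], ["x", "y", "z"]))  # 0 (identical lists)
print(topk_footrule(["x", "y", "z"], ["y", "x", "z"]))  # 2 (one adjacent swap)
print(topk_footrule(["a", "b", "c"], ["d", "e", "f"]))  # 12 (fully disjoint)
```

The disjoint case illustrates the divergence scenario from the main abstract: once two engines share few or no URLs, such list-based distances saturate near their maximum and lose discriminative power, which is what motivates the content-based measures.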

