Geographic ranking for a local search engine
DOI: 10.1145/1277741.1277979 Conference: SIGIR 2007: Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Amsterdam, The Netherlands, July 23-27, 2007
Traditional ranking schemes of the relevance of a Web page to a user query in a search engine are less appropriate when the search term contains geographic information. Often, geographic entities, such as addresses, city names, and location names, appear only once or twice in a Web page, and are typically not in a heading or larger font. Consequently, an alternative ranking approach to the traditional weighted tf*idf relevance ranking is needed. Further, if a Web site contains a geographic entity, it is often the case that its in- and out-neighbours do not refer to the same entity, although they may refer to other geographic entities. We present a local search engine that applies a novel ranking algorithm suitable for ranking Web pages with geographic content. We describe its major components: geographic ranking, focused crawling, geographic extractor, and the related web-sites feature.
Available from: Arnd Christian König
- "Lane et al., most recently and closest to our work, used external data (including weather information) to improve relevance in mobile search. Geographic Information Retrieval considers, for instance, rankings of documents that take into account geographic references – the problem addressed in  . Queries, though, may not contain an explicit geographic reference (e.g., a city name) but have a "geo intent" nevertheless."
ABSTRACT: The signals used for ranking in local search are very different from web search: in addition to (textual) relevance, measures of (geographic) distance between the user and the search result, as well as measures of popularity of the result are important for effective ranking. Depending on the query and search result, different ways to quantify these factors exist -- for example, it is possible to use customer ratings to quantify the popularity of restaurants, whereas different measures are more appropriate for other types of businesses. Hence, our approach is to capture the different notions of distance/popularity relevant via a number of external data sources (e.g., logs of customer ratings, driving-direction requests, or site accesses). In this paper we will describe the relevant signal contained in a number of such data sources in detail and present methods to integrate these external data sources into the feature generation for local search ranking. In particular, we propose novel backoff methods to alleviate the impact of skew, noise or incomplete data in these logs in a systematic manner. We evaluate our techniques on both human-judged relevance data as well as click-through data from a commercial local search engine.
Proceeding of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2011, Beijing, China, July 25-29, 2011; 01/2011
ABSTRACT: Vertical search engines enable users to find information related to a certain topic. A local search engine is a vertical search engine whose topic revolves around a certain geographical area (such as a city, state, or country). In this paper we describe our experiences developing a crawler for a local search engine for the city of Bellingham, Washington, USA. We focus on the tasks of crawling and indexing a large amount of highly relevant Web pages, and then demonstrate ways in which our search engine has the capability to outperform an industrial search engine.
The Fourth International Conference on Digital Society, ICDS 2010, 10-16 February 2010, St. Maarten, Netherlands Antilles; 01/2010