Nearest Neighbor Queries for R-Trees: Why Not Bottom-Up?
ABSTRACT: Given a query point q, finding the nearest neighbor (NN) object is one of the most important problems in computer science. In this paper, a bottom-up search algorithm for processing NN queries in R-trees is presented. An additional data structure, a hash, is introduced to increase the pruning capability of the proposed algorithm. Based on the hash, the whole data space is partitioned into n × n disjoint cells. Each cell contains pointers to the leaf nodes that intersect the cell. The experiments show that the proposed approach outperforms the existing NN search algorithms, including the BFS algorithm, which is known to be I/O optimal.
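The grid hash described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's implementation: the class name `GridHash`, the square data space `[0, extent]`, the `(xmin, ymin, xmax, ymax)` rectangle layout, and the string leaf identifiers are all hypothetical. The sketch covers only the cell lookup that would give a bottom-up search its starting leaf nodes; the search and pruning steps themselves are not specified in the abstract and are not shown.

```python
from collections import defaultdict


class GridHash:
    """Hypothetical n x n grid over a square data space [0, extent] x [0, extent]."""

    def __init__(self, n, extent=1.0):
        self.n = n
        self.cell = extent / n          # side length of one cell
        self.cells = defaultdict(set)   # (col, row) -> ids of intersecting leaf nodes

    def _index(self, v):
        # Clamp so that coordinates on the upper boundary fall into the last cell.
        return min(self.n - 1, max(0, int(v // self.cell)))

    def add_leaf(self, leaf_id, mbr):
        """Register a leaf node in every cell its bounding rectangle touches."""
        xmin, ymin, xmax, ymax = mbr
        for cx in range(self._index(xmin), self._index(xmax) + 1):
            for cy in range(self._index(ymin), self._index(ymax) + 1):
                self.cells[(cx, cy)].add(leaf_id)

    def leaves_at(self, q):
        """Return the leaf ids registered in the cell containing query point q."""
        key = (self._index(q[0]), self._index(q[1]))
        return self.cells.get(key, set())


grid = GridHash(4)
grid.add_leaf("A", (0.1, 0.1, 0.3, 0.3))
grid.add_leaf("B", (0.2, 0.2, 0.6, 0.4))
print(grid.leaves_at((0.28, 0.28)))
```

A query point then resolves to its candidate leaf nodes with one hash lookup instead of a descent from the root, which is presumably where the bottom-up search begins.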
- Source available from: uni-muenchen.de
ABSTRACT: The R-tree, one of the most popular access methods for rectangles, is based on the heuristic optimization of the area of the enclosing rectangle in each inner node. By running numerous experiments in a standardized testbed under highly varying data, queries and operations, we were able to design the R*-tree, which incorporates a combined optimization of area, margin and overlap of each enclosing rectangle in the directory. In an exhaustive performance comparison using our standardized testbed, the R*-tree clearly outperformed the existing R-tree variants: Guttman's linear and quadratic R-trees and Greene's variant of the R-tree. This superiority of the R*-tree holds for different types of queries and operations, such as map overlay, for both rectangles and multidimensional points in all experiments. From a practical point of view the R*-tree is very attractive for two reasons: (1) it efficiently supports point and spatial data at the same time, and (2) its implementation cost is only slightly higher than that of other R-trees. 01/1990
- ABSTRACT: In this paper we present an analytical model that predicts the performance of R-trees (and their variants) when a range query needs to be answered. The cost model uses knowledge of the dataset only; i.e., the proposed formula that estimates the number of disk accesses is a function of data properties, namely the amount of data and its density in the work space. In other words, the proposed model is applicable even before the construction of the R-tree index, which makes it a useful tool for dynamic spatial databases. Several experiments on synthetic and real datasets show that the proposed analytical model is very accurate, with the relative error usually around 10-15% for uniform and non-uniform distributions. We consider that this error represents the gap between the currently most efficient R-tree implementation (the R*-tree) and the theoretically optimal one. 12/2000
- ABSTRACT: During the last decade, multimedia databases have become increasingly important in many application areas such as medicine, CAD, geography, or molecular biology. An important research issue in the field of multimedia databases is the content-based retrieval of similar multimedia objects such as images, text, and videos. However, in contrast to searching data in a relational database, content-based retrieval requires searching for similar objects as a basic functionality of the database system. Most of the approaches addressing similarity search use a so-called feature transformation, which transforms important properties of the multimedia objects into high-dimensional points (feature vectors). Thus, the similarity search is transformed into a search for points in the high-dimensional feature space that are close to a given query point. Query processing in high-dimensional spaces has therefore been a very active research area over the last few years. A number of new index structures and algorithms have been proposed. It has been shown that the new index structures considerably improve the performance of querying large multimedia databases. Based on recent tutorials [BK 98, BK 00], in this survey we provide an overview of the current state of the art in querying multimedia databases, describing the index structures and algorithms for efficient query processing in high-dimensional spaces. We identify the problems of processing queries in high-dimensional space, and we provide an overview of the proposed approaches to overcome these problems. First publ. in: ACM Computing Surveys 33 (2001), 3, pp. 322-373. 01/2001
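The R*-tree abstract listed above names three quantities that are optimized together: the area, margin and overlap of the enclosing rectangles. As a rough illustration, the sketch below computes them for 2-D rectangles. It assumes rectangles are given as `(xmin, ymin, xmax, ymax)` tuples and takes the margin to be the perimeter. The function names are hypothetical, and how the R*-tree weighs these quantities during insertion and node splitting is not stated in the abstract, so it is not shown.

```python
def area(r):
    """Area of rectangle r = (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = r
    return (xmax - xmin) * (ymax - ymin)


def margin(r):
    """Margin of r, taken here as its perimeter (an assumption for the 2-D case)."""
    xmin, ymin, xmax, ymax = r
    return 2 * ((xmax - xmin) + (ymax - ymin))


def overlap(r, s):
    """Area of the intersection of r and s, or 0 if they do not overlap."""
    w = min(r[2], s[2]) - max(r[0], s[0])
    h = min(r[3], s[3]) - max(r[1], s[1])
    return w * h if w > 0 and h > 0 else 0


def enclose(rects):
    """Smallest axis-aligned rectangle that contains every rectangle in rects."""
    return (
        min(r[0] for r in rects),
        min(r[1] for r in rects),
        max(r[2] for r in rects),
        max(r[3] for r in rects),
    )


r = (0, 0, 4, 2)
s = (3, 1, 6, 5)
print(area(r), margin(r), overlap(r, s), enclose([r, s]))
```

Smaller values of all three tend to mean fewer directory rectangles are visited per query, which is the intuition behind optimizing them in combination rather than area alone.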