Near-Neighbor Query Performance in Search Trees

Shuanhu Wang, Joseph M. Hellerstein and Ilya Lipkind

EECS Department
University of California, Berkeley
Technical Report No. UCB/CSD-98-1012
1998

http://www2.eecs.berkeley.edu/Pubs/TechRpts/1998/CSD-98-1012.pdf

Near-neighbor search is an increasingly important operation for queries over multimedia, text, and other non-standard datatypes. In large databases, near-neighbor searches must be enhanced by indexed retrieval for efficiency. In this paper, we present a detailed analysis of three proposals for near-neighbor search: one based on the R-tree, and two which motivated the invention of new trees, namely the SS-tree and SR-tree. We find that while the new trees do improve performance, the reason for their improvement comes mostly from a new Penalty metric, and not from a variety of other details in their implementation. Our analysis was done using a Generalized Search Tree, which both allowed us to easily do a fair comparison, and also provided the framework for a clearer analysis of the issues at hand.


BibTeX citation:

@techreport{Wang:CSD-98-1012,
    Author = {Wang, Shuanhu and Hellerstein, Joseph M. and Lipkind, Ilya},
    Title = {Near-Neighbor Query Performance in Search Trees},
    Institution = {EECS Department, University of California, Berkeley},
    Year = {1998},
    URL = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/1998/5796.html},
    Number = {UCB/CSD-98-1012},
    Abstract = {Near-neighbor search is an increasingly important operation for queries over multimedia, text, and other non-standard datatypes. In large databases, near-neighbor searches must be enhanced by indexed retrieval for efficiency. In this paper, we present a detailed analysis of three proposals for near-neighbor search: one based on the R-tree, and two which motivated the invention of new trees, namely the SS-tree and SR-tree. We find that while the new trees do improve performance, the reason for their improvement comes mostly from a new Penalty metric, and not from a variety of other details in their implementation. Our analysis was done using a Generalized Search Tree, which both allowed us to easily do a fair comparison, and also provided the framework for a clearer analysis of the issues at hand.}
}

EndNote citation:

%0 Report
%A Wang, Shuanhu
%A Hellerstein, Joseph M.
%A Lipkind, Ilya
%T Near-Neighbor Query Performance in Search Trees
%I EECS Department, University of California, Berkeley
%D 1998
%@ UCB/CSD-98-1012
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/1998/5796.html
%F Wang:CSD-98-1012