Boolean Bounding Predicates for Spatial Access Methods

Megan Thomas and Joseph M. Hellerstein

EECS Department
University of California, Berkeley
Technical Report No. UCB/CSD-02-1174
March 2002

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2002/CSD-02-1174.pdf

Tree-based multidimensional indexes are integral to efficient querying in multimedia and GIS applications. These indexes frequently use shapes in internal tree nodes to describe the data stored in a subtree below. We show that the standard Minimum Bounding Rectangle descriptor can lead to significant inefficiency during tree traversal, due to false positives. We also observe that there is often space in internal nodes for richer, more accurate descriptors than rectangles. We propose exploiting this free space to form subtree predicates based on simple boolean combinations of standard descriptors such as rectangles. Since the problem of choosing these boolean bounding predicates is NP-complete, we implemented and tested several heuristics for tuning the bounding predicates on an index node, and several heuristics for deciding which nodes in the index to improve when available tuning time is limited. We present experiments over a variety of real and synthetic data sets, examining the performance benefit of the various tuning heuristics. Our experiments show that up to 30% or more of the total I/Os in a query workload can be eliminated using the boolean bounding predicates chosen by our algorithms.


BibTeX citation:

@techreport{Thomas:CSD-02-1174,
    Author = {Thomas, Megan and Hellerstein, Joseph M.},
    Title = {Boolean Bounding Predicates for Spatial Access Methods},
    Institution = {EECS Department, University of California, Berkeley},
    Year = {2002},
    Month = {Mar},
    URL = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2002/5462.html},
    Number = {UCB/CSD-02-1174},
    Abstract = {Tree-based multidimensional indexes are integral to efficient querying in multimedia and GIS applications. These indexes frequently use shapes in internal tree nodes to describe the data stored in a subtree below. We show that the standard Minimum Bounding Rectangle descriptor can lead to significant inefficiency during tree traversal, due to false positives. We also observe that there is often space in internal nodes for richer, more accurate descriptors than rectangles. We propose exploiting this free space to form subtree predicates based on simple boolean combinations of standard descriptors such as rectangles. Since the problem of choosing these boolean bounding predicates is NP-complete, we implemented and tested several heuristics for tuning the bounding predicates on an index node, and several heuristics for deciding which nodes in the index to improve when available tuning time is limited. We present experiments over a variety of real and synthetic data sets, examining the performance benefit of the various tuning heuristics. Our experiments show that up to 30% or more of the total I/Os in a query workload can be eliminated using the boolean bounding predicates chosen by our algorithms.}
}

EndNote citation:

%0 Report
%A Thomas, Megan
%A Hellerstein, Joseph M.
%T Boolean Bounding Predicates for Spatial Access Methods
%I EECS Department, University of California, Berkeley
%D 2002
%@ UCB/CSD-02-1174
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2002/5462.html
%F Thomas:CSD-02-1174