Orthogonal Range Queries - California Institute of Technology

Download Report

Transcript Orthogonal Range Queries - California Institute of Technology

Data Structures for
Orthogonal Range Queries
A New Data Structure and Comparison to Previous Work.
Application to Contact Detection in Solid Mechanics.
Sean Mauch
Caltech
April, 2003
Terminology: Orthogonal Range Queries
• An orthogonal range query,
(ORQ), determines the points in
a set which lie within a given
window (bounding box).
• To the right is a set of points
representing the location and
population of cities.
• We do an orthogonal range
query on the cities We show the
window and its projections onto
the coordinate planes.
2
Example Application: Contact Detection
• Algorithm for solid mechanics simulation with contact detection:
– Search for potential contacts between nodes and surfaces. Which
nodes are close to each face?
– Do detailed contact check among potential contacts.
– Enforce contacts by computing force required to remove overlap.
Contact search.
3
Contact check.
Contact Detection Depends on ORQ’s
• Searches for potential contacts are done with orthogonal range queries.
– The capture box contains the nodes that may contact the face.
– An ORQ finds the nodes in the capture box.
• Detecting possible contacts is computationally expensive, typically half
of total solid simulation execution time.
• Good performance depends on choosing an efficient data structure and
search algorithm for the orthogonal range queries.
4
Data Structures for ORQ’s
• Octrees and kd-trees.
– Tried and tested data structures from computational geometry.
• Point in box method, (Swegle, et al.).
– Developed at Sandia National Laboratories for doing parallel contact.
• Cell arrays.
– A bucket sort in 3-D.
– Optimal computational complexity when properly tuned.
– Dense arrays may have a high storage overhead.
• Cell arrays coupled with binary and forward searching.
– New data structure.
5
Kd-trees
• Kd-trees recursively divide the domain by choosing the median in the
chosen coordinate direction.
• Depth is determined only by the number of points.
6
Octrees
• Quadtrees (2-D) and Octrees (3-D) recursively divide space into
quadrants or octants.
• Depth is determined by the number of points and point spacing.
7
Point-in-box Method (Swegle et. al.)
• Sorts and ranks the points in each coordinate direction.
• There are three slices for a window. Choose the slice with the least
points and see if those points are in the window.
• The rank array allows integer instead of floating point comparisons
which is much faster on some processors.
8
Cell Arrays
• The computational domain is spanned by an array of cells. Each cell
contains a set of points.
• Constant time access to cells, but perhaps large storage overhead
because of empty cells.
9
Sparse Cell Arrays
• Array is sparse in one dimension.
• Store only non-empty cells.
• Trade reduced storage for increased cell access time.
10
Cell Arrays Coupled with Searching
•
•
•
•
Cell array does not span one dimension. Produces long, thin cells.
Compared to sparse cell array, we search on records instead of on cells.
Binary search on records is similar to sparse cell array.
If we sort the queries we can do forward searching in each cell.
11
List of Methods
Label
Description
seq. scan
sequential scan
projection
projection
pt-in-box
point-in-box
kd-tree
kd-tree
kd-tree d.
kd-tree with domain checking
octree
octree
cell
cell array
sparse cell sparse cell array
cell b. s.
cell array with binary search
cell f. s.
cell array with forward search
cell f. s. k.
cell array with forward search on keys
12
Parameters
•
•
•
•
•
•
N: The number of records.
K: The records are in K-dimensional space.
M: The number of cells.
I: The number of records in the query.
R: The ratio of the length of a query range to the length of a cell.
Q: The number of queries.
13
Computational Complexity Table
Method
Expected Complexity of ORQ
Storage
seq. scan
N
N
projection
K log N+N1-1/K
KN
pt-in-box
K log N+N1-1/K
(2K+1) N
kd-tree
log N+I
N
kd-tree d.
log N+I
N
octree
log N+I
(D+1) N
cell
(R+1)K+(1+1/R)K I
M+N
sparse cell (R+1)K-1 log(M1/K)+(R+1)K+(1+1/R)K I
M1-1/K+N
cell b. s.
(R+1)K-1 log(N/M1-1/K)+(1+1/R)K-1 I
M1-1/K+N
cell f. s.
(R+1)K-1+N/Q+(1+1/R)K-1 I
M1-1/K+N
cell f. s. k.
(R+1)K-1+N/Q+(1+1/R)K-1 I
M1-1/K+N
14
Test Problem: Randomly Distributed Points
• Records are randomly distributed points in the unit cube.
• Number of records varies from 100 to 1,000,000.
• ORQ on a box containing approximately 10 records is performed around
each point.
15
Execution Time – Random Points
16
Memory Usage – Random Points
17
Conclusions
• Projection methods like Point-In-Box perform poorly.
• The kd-tree does not yield the best performance, but it performs fairly
well over a wide range of problems.
• Cell methods offer the best performance when they are well suited to the
problem at hand.
– Dense cell arrays work well for uniformly distributed data.
– Sparse cell arrays and cell arrays coupled with binary searching offer
good performance for most problems.
– Cell arrays coupled with forward searching offers the best
performance when doing multiple queries.
18