Accelerating Nearest Neighbor Search on Manycore Systems
We develop methods for accelerating metric similarity search that are effective on modern hardware. Our algorithms factor into easily parallelizable components, making them simple to deploy and efficient on multicore CPUs and GPUs. Despite the simple structure of our algorithms, their search performance is provably sublinear in the size of the database, with a factor dependent only on its intrinsic dimensionality. We demonstrate that our methods provide substantial speedups on a range of datasets and hardware platforms. In particular, we present results on a 48-core server machine, on graphics hardware, and on a multicore desktop.
💡 Research Summary
The paper presents a set of algorithms designed to accelerate metric‑based similarity search—specifically nearest‑neighbor (NN) queries—on modern many‑core hardware such as multicore CPUs and GPUs. The authors observe that traditional NN structures (kd‑trees, ball‑trees, LSH, FLANN, HNSW) suffer from two fundamental problems on large, high‑dimensional datasets: (1) the "curse of dimensionality," which makes search cost grow nearly linearly in the number of points, and (2) irregular memory access patterns that prevent efficient use of cache hierarchies and SIMD units. To address these issues, the paper introduces a two‑stage framework: a partition‑and‑sort preprocessing step followed by a highly parallel query engine.
In the preprocessing stage the database is divided into K clusters using a fast, approximate clustering method (e.g., k‑means++ or a simple random projection). Within each cluster the points are sorted by their distance from the cluster centroid, producing a monotonic list. For each cluster the algorithm also records a tight distance interval spanning its members' distances to the centroid; at query time, the triangle inequality applied to this interval allows entire clusters—and, because each list is sorted, entire suffixes of a cluster's list—to be pruned without computing any per‑point distances.
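The partition-and-sort scheme described above can be sketched as follows. This is an illustrative NumPy implementation under our own assumptions, not the paper's code: we use a few Lloyd iterations as the "fast, approximate" clustering stand-in, and the function names (`build_index`, `query`) and parameters (`k`, `iters`) are hypothetical.

```python
import numpy as np

def build_index(points, k=8, iters=5, seed=0):
    """Preprocessing sketch: partition into k clusters, sort each cluster's
    points by distance to its centroid, and keep the sorted distances
    (whose first/last entries give the cluster's distance interval)."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), k, replace=False)].copy()
    for _ in range(iters):  # cheap approximate clustering (Lloyd iterations)
        dists = np.linalg.norm(points[:, None] - centroids[None], axis=2)
        assign = dists.argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):
                centroids[j] = points[assign == j].mean(axis=0)
    clusters = []
    for j in range(k):
        idx = np.flatnonzero(assign == j)
        if idx.size == 0:
            continue
        r = np.linalg.norm(points[idx] - centroids[j], axis=1)
        order = np.argsort(r)  # the "monotonic list" of the summary
        clusters.append((centroids[j], idx[order], r[order]))
    return clusters

def query(clusters, points, q):
    """Exact 1-NN via triangle-inequality pruning on the sorted lists.
    Lower bound used: d(q, p) >= |d(q, c) - d(c, p)|."""
    best_d, best_i = np.inf, -1
    # Visit clusters nearest-centroid-first so pruning kicks in early.
    for c, idx, r in sorted(clusters, key=lambda t: np.linalg.norm(q - t[0])):
        dc = np.linalg.norm(q - c)
        if dc - r[-1] > best_d:        # whole cluster outside the interval
            continue
        for i, ri in zip(idx, r):
            if ri - dc > best_d:       # sorted: every later point pruned too
                break
            if abs(dc - ri) > best_d:  # per-point lower bound fails
                continue
            d = np.linalg.norm(q - points[i])
            if d < best_d:
                best_d, best_i = d, i
    return best_i, best_d
```

Because every skipped point fails a valid lower bound, the result matches a brute-force scan exactly; the speedup comes from how many distance computations the interval and the sorted order let the query avoid. The per-cluster scans are independent, which is what makes the structure easy to parallelize across cores.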