|
Quality and Efficiency in High Dimensional Nearest Neighbor Search |
|
In ACM Conference on Management of Data (SIGMOD), 2009 |
| Abstract |
|
Motivated by this, we propose a new access method called
the locality sensitive B-tree (LSB-tree) that enables fast
high-dimensional NN search with excellent quality. The combination
of several LSB-trees leads to a structure called the LSB-forest that ensures
the same result quality as rigorous-LSH, but reduces its space and query
cost dramatically. The LSB-forest also outperforms adhoc-LSH, even though
the latter has no quality guarantee. Besides its appealing theoretical
properties, the LSB-tree itself also serves as an effective index that
consumes linear space, and supports efficient updates. Our extensive
experiments confirm that the LSB-tree is faster than (i) the state of the art
of exact NN search by two orders of magnitude, and (ii) the best
(linear-space) method of approximate retrieval by an order of magnitude, and
at the same time, returns neighbors with much better quality. |
| Paper download |
|
| Implementation and datasets |
Before downloading, please read the release notes. Codes. LSB [coded by Yufei Tao under visual C++ 6.0][released on 1 Sep. 2009] This implementation is for the long version of the paper, which is under submission for journal publication Binary. lsb.exe Datasets. color (version 2), mnist. |