|
Reverse Nearest Neighbor Search in Metric Spaces |
|
IEEE Transactions on Knowledge and Data Engineering (TKDE), 18(9): 1239-1252, 2006 |
| Abstract |
|
|
| Important dates |
| Submitted in May 2005, revised in Dec 2005, accepted in April 2006. |
| Paper download |
![]() |
| Implementation and datasets |
| Before you proceed with downloading, please read and agree to the terms of using our implementation. Download our source codes (implemented: by Man Lung Yiu). Datasets used in our experiments: SF, TS, Color, Signature. Dataset description: SF contains points representing 174k locations in San Francisco. The second dataset TS is a time series containing 76k values corresponding to Dow Jones indexes on 19k consecutive days (i.e., 4 values per day). We create 5D points by placing a sliding window with a 5-value length on all positions in the series, and taking the values covered by each window as the coordinates of a data point. The third dataset Color involves 4D vectors6 representing the color histograms of 68k images. We also generate a Signature dataset, where each object is a string with 65 English letters. We first obtain 20 “anchor signatures”, whose letters are randomly chosen from the alphabet. Then, each anchor produces a cluster with 2.5k objects (resulting in the total cardinality 50k), each of which is obtained by randomly changing x positions in the corresponding anchor signature to other random letters, where x is uniformly distributed in range [1, 18]. File format: Each dataset is in binary
format. The byte ordering conforms to
little endian. The size of an "integer/float" is 4 bytes and that of a
"char" is 1 byte. The first 8 bytes (2 integers) constitute the header: the
first number gives the dataset cardinality, while the 2nd indicates the
storage size of an object (in bytes). Then, object data are stored
sequentially after the headers with this format. |