GQR: A General and Efficient Querying Method for Learning to Hash

As an effective solution to the approximate nearest neighbors (ANN) search problem, learning to hash (L2H) is able to learn similarity-preserving hash functions tailored for a given dataset. However, existing L2H research mainly focuses on improving query performance by learning good hash functions, and uses Hamming ranking (HR) as the default querying framework. We showed by analysis and experiments that Hamming distance, the similarity indicator used in HR, is too coarse-grained and thus limits the performance of query processing.

We proposed a new fine-grained similarity indicator, quantization distance (QD), that provides more information about the similarity between a query and the items in a bucket. Based on QD, we then developed a new querying framework, GQR, that avoids ranking all buckets before fetching items and achieves significantly better query performance than HR. Noted that GQR is general to work with a variety of L2H algorithms, and has been paralleled as part of our distributed system LoSHa in the Husky framework.