Song, Yitong, Kai Wang, Bin Yao, Zhida Chen, Jiong Xie, and Feifei Li. "Efficient Reverse $ k $ Approximate Nearest Neighbor Search Over High-Dimensional Vectors." In 2024 IEEE 40th International Conference on Data Engineering (ICDE), pp. 4262-4274. IEEE, 2024
2024年11月04日

【Abstract】  Reverse k nearest neighbor search (\mathbf{R}k\mathbf{NNS}) plays an important role in various data processing and analysis tasks, seeking to pinpoint data considering the query data q among their k nearest neighbors. As large models gain popularity, processing high-dimensional vectors has become more and more widespread. However, existing \mathbf{R}k\mathbf{NNS} solutions face inefficiency when handling large-scale high-dimensional vectors due to their sensitivity to data dimensions and sizes during index construction or the verification of numerous candidate results in the query phase. Motivated by these challenges and the inherent intricacies of high-dimensional data processing, in this paper, we study an approximate version of the \mathbf{R}k\mathbf{NNS} problem (\mathbf{R}k\mathbf{ANNS}) for high-dimensional vectors, aiming to offer efficient and practical solutions. To this end, we propose a new proximity-graph-based index called HAMG, which enables finding the query results within k hops from q . We also present a user-friendly query algorithm on HAMG that can adaptively adjust the search scope based on the desired query recall of users. To further enhance the query process, two pruning strategies are proposed to reduce the number of candidates requiring verification. Extensive experiments validate that HAMG scales well for data dimensions and sizes, and our query algorithm improves query efficiency by up to two orders of magnitude while maintaining comparable query accuracy against existing approaches.