abstract = "We take a first step towards a rigorous asymptotic analysis of graph-based methods for finding (approximate) nearest neighbors in high-dimensional spaces, by analyzing the complexity of randomized greedy walks on the approximate nearest neighbor graph. For random data sets of size $n = 2^{o(d)}$ on the $d$-dimensional Euclidean unit sphere, using near neighbor graphs we can provably solve the approximate nearest neighbor problem with approximation factor $c > 1$ in query time $n^{\rho_q + o(1)}$ and space $n^{1 + \rho_s + o(1)}$, for arbitrary $\rho_q, \rho_s \geq 0$ satisfying $(2c^2 - 1)\rho_q + 2c^2(c^2 - 1)\sqrt{\rho_s(1 - \rho_s)} \geq c^4$. Graph-based near neighbor searching is especially competitive with hash-based methods for small $c$ and near-linear memory, and in this regime the asymptotic scaling of a greedy graph-based search matches optimal hash-based trade-offs of Andoni-Laarhoven-Razenshteyn-Waingarten [5]. We further study how the trade-offs scale when the data set is of size $n = 2^{\Theta(d)}$, and analyze asymptotic complexities when applying these results to lattice sieving.",

keywords = "Approximate nearest neighbor problem, Locality-sensitive filters, Locality-sensitive hashing, Near neighbor graphs, Similarity search",

