TY - GEN

T1 - Graph-based time-space trade-offs for approximate near neighbors

AU - Laarhoven, Thijs

PY - 2018/5/24

Y1 - 2018/5/24

N2 - We take a first step towards a rigorous asymptotic analysis of graph-based methods for finding (approximate) nearest neighbors in high-dimensional spaces, by analyzing the complexity of randomized greedy walks on the approximate nearest neighbor graph. For random data sets of size n = 2o(d) on the d-dimensional Euclidean unit sphere, using near neighbor graphs we can provably solve the approximate nearest neighbor problem with approximation factor c > 1 in query time nρq+o(1) and space n1+ρs+o(1), for arbitrary ρq,ρs ≥ 0 satisfying (2c2 - 1)ρq + 2c2(c2 - 1)√ρs(1 - ρs) ≥ c4. (1) Graph-based near neighbor searching is especially competitive with hash-based methods for small c and near-linear memory, and in this regime the asymptotic scaling of a greedy graph-based search matches optimal hash-based trade-offs of Andoni-Laarhoven-Razenshteyn-Waingarten [5]. We further study how the trade-offs scale when the data set is of size n = 2Θ(d), and analyze asymptotic complexities when applying these results to lattice sieving.

AB - We take a first step towards a rigorous asymptotic analysis of graph-based methods for finding (approximate) nearest neighbors in high-dimensional spaces, by analyzing the complexity of randomized greedy walks on the approximate nearest neighbor graph. For random data sets of size n = 2o(d) on the d-dimensional Euclidean unit sphere, using near neighbor graphs we can provably solve the approximate nearest neighbor problem with approximation factor c > 1 in query time nρq+o(1) and space n1+ρs+o(1), for arbitrary ρq,ρs ≥ 0 satisfying (2c2 - 1)ρq + 2c2(c2 - 1)√ρs(1 - ρs) ≥ c4. (1) Graph-based near neighbor searching is especially competitive with hash-based methods for small c and near-linear memory, and in this regime the asymptotic scaling of a greedy graph-based search matches optimal hash-based trade-offs of Andoni-Laarhoven-Razenshteyn-Waingarten [5]. We further study how the trade-offs scale when the data set is of size n = 2Θ(d), and analyze asymptotic complexities when applying these results to lattice sieving.

KW - Approximate nearest neighbor problem

KW - Locality-sensitive filters

KW - Locality-sensitive hashing

KW - Near neighbor graphs

KW - Similarity search

UR - http://www.scopus.com/inward/record.url?scp=85048990742&partnerID=8YFLogxK

U2 - 10.4230/LIPIcs.SoCG.2018.57

DO - 10.4230/LIPIcs.SoCG.2018.57

M3 - Conference contribution

AN - SCOPUS:85048990742

SN - 978-3-95977-066-8

VL - 99

T3 - Leibniz International Proceedings in Informatics (LIPIcs)

SP - 1

EP - 14

BT - 34th International Symposium on Computational Geometry, SoCG 2018

A2 - Speckmann, Bettina

A2 - Toth, Csaba

PB - Schloss Dagstuhl - Leibniz-Zentrum für Informatik

CY - Dagstuhl

T2 - 34th International Symposium on Computational Geometry (SoCG 2018)

Y2 - 11 June 2018 through 14 June 2018

ER -