Polytopes, lattices, and spherical codes for the nearest neighbor problem

Onderzoeksoutput: Hoofdstuk in Boek/Rapport/CongresprocedureConferentiebijdrageAcademicpeer review

Samenvatting

We study locality-sensitive hash methods for the nearest neighbor problem for the angular distance, focusing on the approach of first projecting down onto a random low-dimensional subspace, and then partitioning the projected vectors according to the Voronoi cells induced by a well-chosen spherical code. This approach generalizes and interpolates between the fast but asymptotically suboptimal hyperplane hashing of Charikar [STOC 2002], and asymptotically optimal but practically often slower hash families of e.g. Andoni-Indyk [FOCS 2006], Andoni-Indyk-Nguyen-Razenshteyn [SODA 2014] and Andoni-Indyk-Laarhoven-Razenshteyn-Schmidt [NIPS 2015]. We set up a framework for analyzing the performance of any spherical code in this context, and we provide results for various codes appearing in the literature, such as those related to regular polytopes and root lattices. Similar to hyperplane hashing, and unlike e.g. cross-polytope hashing, our analysis of collision probabilities and query exponents is exact and does not hide any order terms which vanish only for large d, thus facilitating an easier parameter selection in practical applications. For the two-dimensional case, we analytically derive closed-form expressions for arbitrary spherical codes, and we show that the equilateral triangle is optimal, achieving a better performance than the two-dimensional analogues of hyperplane and cross-polytope hashing. In three and four dimensions, we numerically find that the tetrahedron and 5-cell (the 3-simplex and 4-simplex) and the 16-cell (the 4-orthoplex) achieve the best query exponents, while in five or more dimensions orthoplices appear to outperform regular simplices, as well as the root lattice families Ak and Dk in terms of minimizing the query exponent. We provide lower bounds based on spherical caps, and we predict that in higher dimensions, larger spherical codes exist which outperform orthoplices in terms of the query exponent, and we argue why using the Dk root lattices will likely lead to better results in practice as well (compared to using cross-polytopes), due to a better trade-off between the asymptotic query exponent and the concrete costs of hashing.

Originele taal-2Engels
Titel47th International Colloquium on Automata, Languages, and Programming, ICALP 2020
RedacteurenArtur Czumaj, Anuj Dawar, Emanuela Merelli
UitgeverijSchloss Dagstuhl - Leibniz-Zentrum für Informatik
ISBN van elektronische versie9783959771382
DOI's
StatusGepubliceerd - 1 jun 2020
Evenement47th International Colloquium on Automata, Languages, and Programming, ICALP 2020 - Virtual, Online, Duitsland
Duur: 8 jul 202011 jul 2020

Publicatie series

NaamLeibniz International Proceedings in Informatics, LIPIcs
Volume168
ISSN van geprinte versie1868-8969

Congres

Congres47th International Colloquium on Automata, Languages, and Programming, ICALP 2020
LandDuitsland
StadVirtual, Online
Periode8/07/2011/07/20

Vingerafdruk Duik in de onderzoeksthema's van 'Polytopes, lattices, and spherical codes for the nearest neighbor problem'. Samen vormen ze een unieke vingerafdruk.

Citeer dit