A parallel and distributed approach for diversified top-k best region search

Hamid Shahrivari, Matthaios Olma, Odysseas Papapetrou, Dimitrios Skoutas, Anastasia Ailamaki

Onderzoeksoutput: Hoofdstuk in Boek/Rapport/CongresprocedureConferentiebijdrageAcademicpeer review

Samenvatting

Given a set of points, the Best Region Search problem finds the optimal location of a rectangle of a specified size such that the value of a user-defined scoring function over its enclosed points is maximized. A recently proposed top-k algorithm for this problem returns results progressively, while also incorporating additional constraints, such as taking into consideration the overlap between the set of selected top-k rectangles. However, the algorithm is designed for a centralized setting and does not scale to very large datasets. In this paper, we overcome this limitation by enabling parallel and distributed computation of the results. We first propose a strategy that employs multiple rounds to progressively collect partial top-k results from each node in the cluster, while a coordinator handles the aggregation of the global top-k list, dealing with overlapping results. We then devise a single-round strategy, where the algorithm executed by each node is enhanced with additional conditions that anticipate potential overlapping solutions from neighboring nodes. Additional optimizations are proposed to further increase performance. Our experiments on real-world datasets indicate that our proposed algorithms are efficient and scale to millions of points.

Originele taal-2Engels
TitelAdvances in Database Technology - EDBT 2020
Subtitel23rd International Conference on Extending Database Technology, Proceedings
RedacteurenAngela Bonifati, Yongluan Zhou, Marcos Antonio Vaz Salles, Alexander Bohm, Dan Olteanu, George Fletcher, Arijit Khan, Bin Yang
UitgeverijOpenProceedings.org
Pagina's265-276
Aantal pagina's12
ISBN van elektronische versie9783893180837
DOI's
StatusGepubliceerd - 1 jan 2020
Evenement23rd International Conference on Extending Database Technology, EDBT 2020 - Copenhagen, Denemarken
Duur: 30 mrt 20202 apr 2020

Congres

Congres23rd International Conference on Extending Database Technology, EDBT 2020
LandDenemarken
StadCopenhagen
Periode30/03/202/04/20

Vingerafdruk Duik in de onderzoeksthema's van 'A parallel and distributed approach for diversified top-k best region search'. Samen vormen ze een unieke vingerafdruk.

  • Citeer dit

    Shahrivari, H., Olma, M., Papapetrou, O., Skoutas, D., & Ailamaki, A. (2020). A parallel and distributed approach for diversified top-k best region search. In A. Bonifati, Y. Zhou, M. A. Vaz Salles, A. Bohm, D. Olteanu, G. Fletcher, A. Khan, & B. Yang (editors), Advances in Database Technology - EDBT 2020: 23rd International Conference on Extending Database Technology, Proceedings (blz. 265-276). OpenProceedings.org. https://doi.org/10.5441/002/edbt.2020.24