A parallel and distributed approach for diversified top-k best region search

Hamid Shahrivari, Matthaios Olma, Odysseas Papapetrou, Dimitrios Skoutas, Anastasia Ailamaki

Research output: Chapter in Book/Report/Conference proceedingConference contributionAcademicpeer-review

Abstract

Given a set of points, the Best Region Search problem finds the optimal location of a rectangle of a specified size such that the value of a user-defined scoring function over its enclosed points is maximized. A recently proposed top-k algorithm for this problem returns results progressively, while also incorporating additional constraints, such as taking into consideration the overlap between the set of selected top-k rectangles. However, the algorithm is designed for a centralized setting and does not scale to very large datasets. In this paper, we overcome this limitation by enabling parallel and distributed computation of the results. We first propose a strategy that employs multiple rounds to progressively collect partial top-k results from each node in the cluster, while a coordinator handles the aggregation of the global top-k list, dealing with overlapping results. We then devise a single-round strategy, where the algorithm executed by each node is enhanced with additional conditions that anticipate potential overlapping solutions from neighboring nodes. Additional optimizations are proposed to further increase performance. Our experiments on real-world datasets indicate that our proposed algorithms are efficient and scale to millions of points.

Original languageEnglish
Title of host publicationAdvances in Database Technology - EDBT 2020
Subtitle of host publication23rd International Conference on Extending Database Technology, Proceedings
EditorsAngela Bonifati, Yongluan Zhou, Marcos Antonio Vaz Salles, Alexander Bohm, Dan Olteanu, George Fletcher, Arijit Khan, Bin Yang
PublisherOpenProceedings.org
Pages265-276
Number of pages12
ISBN (Electronic)9783893180837
DOIs
Publication statusPublished - 1 Jan 2020
Event23rd International Conference on Extending Database Technology, EDBT 2020 - Copenhagen, Denmark
Duration: 30 Mar 20202 Apr 2020

Conference

Conference23rd International Conference on Extending Database Technology, EDBT 2020
CountryDenmark
CityCopenhagen
Period30/03/202/04/20

Fingerprint Dive into the research topics of 'A parallel and distributed approach for diversified top-k best region search'. Together they form a unique fingerprint.

  • Cite this

    Shahrivari, H., Olma, M., Papapetrou, O., Skoutas, D., & Ailamaki, A. (2020). A parallel and distributed approach for diversified top-k best region search. In A. Bonifati, Y. Zhou, M. A. Vaz Salles, A. Bohm, D. Olteanu, G. Fletcher, A. Khan, & B. Yang (Eds.), Advances in Database Technology - EDBT 2020: 23rd International Conference on Extending Database Technology, Proceedings (pp. 265-276). OpenProceedings.org. https://doi.org/10.5441/002/edbt.2020.24