Abstract
In today’s data-driven landscape, geospatial streams are pivotal in diverse fields, ranging from sociology to network engineering
and to meteorology. A key challenge in utilizing these streams is to efficiently compute aggregates over ad-hoc spatial ranges,
possibly with additional predicates on the stream items. For each application scenario, different aggregates become relevant, such
as the number of distinct items, the frequency of each item, or even the variance of the frequencies of the items that fall within
a spatial range.
Storing the entire stream for computing these aggregates is impractical in scenarios that involve fast-paced and unbounded
streams, due to prohibitive storage costs and query execution delays. To address this, we propose two sketches, SpatialSketch
and DynSketch, that support aggregate queries with different types of aggregates. Both sketches require small space, and they
can summarize fast-paced streams and estimate the aggregates, with accuracy guarantees. Importantly, they support new diverse
functionalities, in a plug-and-play manner, without requiring novel theoretical analysis. In addition to the theoretical contribution,
we evaluate SpatialSketch and DynSketch experimentally. Our experiments demonstrate that the two sketches outperform the
state of the art, and that they can be used for addressing novel functionalities for which there exist no small-space solutions to
date.
and to meteorology. A key challenge in utilizing these streams is to efficiently compute aggregates over ad-hoc spatial ranges,
possibly with additional predicates on the stream items. For each application scenario, different aggregates become relevant, such
as the number of distinct items, the frequency of each item, or even the variance of the frequencies of the items that fall within
a spatial range.
Storing the entire stream for computing these aggregates is impractical in scenarios that involve fast-paced and unbounded
streams, due to prohibitive storage costs and query execution delays. To address this, we propose two sketches, SpatialSketch
and DynSketch, that support aggregate queries with different types of aggregates. Both sketches require small space, and they
can summarize fast-paced streams and estimate the aggregates, with accuracy guarantees. Importantly, they support new diverse
functionalities, in a plug-and-play manner, without requiring novel theoretical analysis. In addition to the theoretical contribution,
we evaluate SpatialSketch and DynSketch experimentally. Our experiments demonstrate that the two sketches outperform the
state of the art, and that they can be used for addressing novel functionalities for which there exist no small-space solutions to
date.
Original language | English |
---|---|
Title of host publication | Proceedings 28th International Conference on Extending Database Technology, Proceedings, EDBT 2025 |
Subtitle of host publication | Barcelona, Spain, March 25-March 28 |
Publisher | OpenProceedings.org |
Pages | 284-296 |
Number of pages | 13 |
ISBN (Electronic) | 978-3-89318-098-1 |
DOIs | |
Publication status | Published - 11 Nov 2024 |
Event | 28th International Conference on Extending Database Technology, EDBT 2025 - Barcelona, Spain Duration: 25 Mar 2025 → 28 Mar 2025 |
Publication series
Name | Advances in Database Technology |
---|---|
Number | 2 |
Volume | 28 |
ISSN (Electronic) | 2367-2005 |
Conference
Conference | 28th International Conference on Extending Database Technology, EDBT 2025 |
---|---|
Abbreviated title | EDBT 2025 |
Country/Territory | Spain |
City | Barcelona |
Period | 25/03/25 → 28/03/25 |
Funding
This work was partially funded by the European Commission under the STELAR (HORIZON-EUROPE - Grant No. 101070122) project.
Keywords
- databases
- Streaming data
- Sketches
- approximation algorithm
- spatial analysis