Data stream statistics over sliding windows: how to summarize 150 million updates per second on a single node

Grigorios Chrysos, Odysseas Papapetrou, Dionisios Pnevmatikatos, Apostolos Dollas, Minos Garofalakis

Onderzoeksoutput: Hoofdstuk in Boek/Rapport/CongresprocedureConferentiebijdrageAcademicpeer review

Uittreksel

Traditional data management systems map information using centralized and static data structures. Modern applications need to process in real time datasets much larger than system memory. To achieve this, they use dynamic entities that are updated with streaming input data over a sliding window. For efficient and high performance processing, approximate sketch synopses of input streams have been proposed as effective means for the summarization of streaming data over large sliding windows with probabilistic accuracy guarantees. This work presents a system-level solution to accelerate the Exponential Count-Min (ECM) sketch algorithm on reconfigurable technology. Different reconfigurable architectures for the sketch structure that correspond to different cost and performance tradeoffs are presented. We map the proposed system-level ECM sketch architectures to a high-end modern HPC platform to achieve guaranteed and best-effort update rates up to 150 and 180 million tuples per second respectively. We compare the performance of the implemented system against the best optimized multi-thread software alternative and show that our scalable full-system accelerators outperform software solutions by 5-7.5x for Virtex6 devices and in excess of 10x for current Ultrascale devices.

Originele taal-2Engels
TitelProceedings - 29th International Conference on Field-Programmable Logic and Applications, FPL 2019
RedacteurenIoannis Sourdis, Christos-Savvas Bouganis, Carlos Alvarez, Leonel Antonio Toledo Diaz, Pedro Valero, Xavier Martorell
Plaats van productiePiscataway
UitgeverijInstitute of Electrical and Electronics Engineers
Pagina's278-285
Aantal pagina's8
ISBN van elektronische versie978-1-7281-4884-7
DOI's
StatusGepubliceerd - sep 2019
Evenement29th International Conferenceon Field-Programmable Logic and Applications, FPL 2019 - Barcelona, Spanje
Duur: 9 sep 201913 sep 2019

Congres

Congres29th International Conferenceon Field-Programmable Logic and Applications, FPL 2019
LandSpanje
StadBarcelona
Periode9/09/1913/09/19

Vingerafdruk

sliding
Statistics
statistics
Reconfigurable architectures
Information management
Particle accelerators
computer programs
Data structures
data management
data structures
threads
management systems
tradeoffs
Data storage equipment
accelerators
platforms
Processing
costs
Costs

Citeer dit

Chrysos, G., Papapetrou, O., Pnevmatikatos, D., Dollas, A., & Garofalakis, M. (2019). Data stream statistics over sliding windows: how to summarize 150 million updates per second on a single node. In I. Sourdis, C-S. Bouganis, C. Alvarez, L. A. Toledo Diaz, P. Valero, & X. Martorell (editors), Proceedings - 29th International Conference on Field-Programmable Logic and Applications, FPL 2019 (blz. 278-285). [8892241] Piscataway: Institute of Electrical and Electronics Engineers. https://doi.org/10.1109/FPL.2019.00052
Chrysos, Grigorios ; Papapetrou, Odysseas ; Pnevmatikatos, Dionisios ; Dollas, Apostolos ; Garofalakis, Minos. / Data stream statistics over sliding windows : how to summarize 150 million updates per second on a single node. Proceedings - 29th International Conference on Field-Programmable Logic and Applications, FPL 2019. redacteur / Ioannis Sourdis ; Christos-Savvas Bouganis ; Carlos Alvarez ; Leonel Antonio Toledo Diaz ; Pedro Valero ; Xavier Martorell. Piscataway : Institute of Electrical and Electronics Engineers, 2019. blz. 278-285
@inproceedings{913fb1ec75094bce9dac05686ae480f6,
title = "Data stream statistics over sliding windows: how to summarize 150 million updates per second on a single node",
abstract = "Traditional data management systems map information using centralized and static data structures. Modern applications need to process in real time datasets much larger than system memory. To achieve this, they use dynamic entities that are updated with streaming input data over a sliding window. For efficient and high performance processing, approximate sketch synopses of input streams have been proposed as effective means for the summarization of streaming data over large sliding windows with probabilistic accuracy guarantees. This work presents a system-level solution to accelerate the Exponential Count-Min (ECM) sketch algorithm on reconfigurable technology. Different reconfigurable architectures for the sketch structure that correspond to different cost and performance tradeoffs are presented. We map the proposed system-level ECM sketch architectures to a high-end modern HPC platform to achieve guaranteed and best-effort update rates up to 150 and 180 million tuples per second respectively. We compare the performance of the implemented system against the best optimized multi-thread software alternative and show that our scalable full-system accelerators outperform software solutions by 5-7.5x for Virtex6 devices and in excess of 10x for current Ultrascale devices.",
keywords = "ECM sketch, Exponential histogram, Reconfigurable architecture, Reconfigurable computing, Stream processing",
author = "Grigorios Chrysos and Odysseas Papapetrou and Dionisios Pnevmatikatos and Apostolos Dollas and Minos Garofalakis",
year = "2019",
month = "9",
doi = "10.1109/FPL.2019.00052",
language = "English",
pages = "278--285",
editor = "Ioannis Sourdis and Christos-Savvas Bouganis and Carlos Alvarez and {Toledo Diaz}, {Leonel Antonio} and Pedro Valero and Xavier Martorell",
booktitle = "Proceedings - 29th International Conference on Field-Programmable Logic and Applications, FPL 2019",
publisher = "Institute of Electrical and Electronics Engineers",
address = "United States",

}

Chrysos, G, Papapetrou, O, Pnevmatikatos, D, Dollas, A & Garofalakis, M 2019, Data stream statistics over sliding windows: how to summarize 150 million updates per second on a single node. in I Sourdis, C-S Bouganis, C Alvarez, LA Toledo Diaz, P Valero & X Martorell (redactie), Proceedings - 29th International Conference on Field-Programmable Logic and Applications, FPL 2019., 8892241, Institute of Electrical and Electronics Engineers, Piscataway, blz. 278-285, 29th International Conferenceon Field-Programmable Logic and Applications, FPL 2019, Barcelona, Spanje, 9/09/19. https://doi.org/10.1109/FPL.2019.00052

Data stream statistics over sliding windows : how to summarize 150 million updates per second on a single node. / Chrysos, Grigorios; Papapetrou, Odysseas; Pnevmatikatos, Dionisios; Dollas, Apostolos; Garofalakis, Minos.

Proceedings - 29th International Conference on Field-Programmable Logic and Applications, FPL 2019. redactie / Ioannis Sourdis; Christos-Savvas Bouganis; Carlos Alvarez; Leonel Antonio Toledo Diaz; Pedro Valero; Xavier Martorell. Piscataway : Institute of Electrical and Electronics Engineers, 2019. blz. 278-285 8892241.

Onderzoeksoutput: Hoofdstuk in Boek/Rapport/CongresprocedureConferentiebijdrageAcademicpeer review

TY - GEN

T1 - Data stream statistics over sliding windows

T2 - how to summarize 150 million updates per second on a single node

AU - Chrysos, Grigorios

AU - Papapetrou, Odysseas

AU - Pnevmatikatos, Dionisios

AU - Dollas, Apostolos

AU - Garofalakis, Minos

PY - 2019/9

Y1 - 2019/9

N2 - Traditional data management systems map information using centralized and static data structures. Modern applications need to process in real time datasets much larger than system memory. To achieve this, they use dynamic entities that are updated with streaming input data over a sliding window. For efficient and high performance processing, approximate sketch synopses of input streams have been proposed as effective means for the summarization of streaming data over large sliding windows with probabilistic accuracy guarantees. This work presents a system-level solution to accelerate the Exponential Count-Min (ECM) sketch algorithm on reconfigurable technology. Different reconfigurable architectures for the sketch structure that correspond to different cost and performance tradeoffs are presented. We map the proposed system-level ECM sketch architectures to a high-end modern HPC platform to achieve guaranteed and best-effort update rates up to 150 and 180 million tuples per second respectively. We compare the performance of the implemented system against the best optimized multi-thread software alternative and show that our scalable full-system accelerators outperform software solutions by 5-7.5x for Virtex6 devices and in excess of 10x for current Ultrascale devices.

AB - Traditional data management systems map information using centralized and static data structures. Modern applications need to process in real time datasets much larger than system memory. To achieve this, they use dynamic entities that are updated with streaming input data over a sliding window. For efficient and high performance processing, approximate sketch synopses of input streams have been proposed as effective means for the summarization of streaming data over large sliding windows with probabilistic accuracy guarantees. This work presents a system-level solution to accelerate the Exponential Count-Min (ECM) sketch algorithm on reconfigurable technology. Different reconfigurable architectures for the sketch structure that correspond to different cost and performance tradeoffs are presented. We map the proposed system-level ECM sketch architectures to a high-end modern HPC platform to achieve guaranteed and best-effort update rates up to 150 and 180 million tuples per second respectively. We compare the performance of the implemented system against the best optimized multi-thread software alternative and show that our scalable full-system accelerators outperform software solutions by 5-7.5x for Virtex6 devices and in excess of 10x for current Ultrascale devices.

KW - ECM sketch

KW - Exponential histogram

KW - Reconfigurable architecture

KW - Reconfigurable computing

KW - Stream processing

UR - http://www.scopus.com/inward/record.url?scp=85075631653&partnerID=8YFLogxK

U2 - 10.1109/FPL.2019.00052

DO - 10.1109/FPL.2019.00052

M3 - Conference contribution

AN - SCOPUS:85075631653

SP - 278

EP - 285

BT - Proceedings - 29th International Conference on Field-Programmable Logic and Applications, FPL 2019

A2 - Sourdis, Ioannis

A2 - Bouganis, Christos-Savvas

A2 - Alvarez, Carlos

A2 - Toledo Diaz, Leonel Antonio

A2 - Valero, Pedro

A2 - Martorell, Xavier

PB - Institute of Electrical and Electronics Engineers

CY - Piscataway

ER -

Chrysos G, Papapetrou O, Pnevmatikatos D, Dollas A, Garofalakis M. Data stream statistics over sliding windows: how to summarize 150 million updates per second on a single node. In Sourdis I, Bouganis C-S, Alvarez C, Toledo Diaz LA, Valero P, Martorell X, redacteurs, Proceedings - 29th International Conference on Field-Programmable Logic and Applications, FPL 2019. Piscataway: Institute of Electrical and Electronics Engineers. 2019. blz. 278-285. 8892241 https://doi.org/10.1109/FPL.2019.00052