Easy spark

Y. van den Wildenberg, Wouter Nuijten, O. Papapetrou

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; Academic; peer-review

Abstract

Today's data deluge calls for novel, scalable data handling and processing solutions. Spark has emerged as a popular distributed in-memory computing engine for processing and analysing large amounts of data in parallel. However, designing parallel processing pipelines is fundamentally different from traditional programming, and hence most programmers are either unable to start using Spark or do not utilise it to its full potential. This study describes an easier entry point into Spark. We design and implement a GUI that allows any programmer with knowledge of a standard programming language (e.g., Python or Java) to write Spark applications effortlessly and interactively, and to submit them to large clusters for execution.

Original language: English
Title of host publication: Proceedings of the Workshops of the EDBT/ICDT 2021 Joint Conference, Nicosia, Cyprus, March 23, 2021
Editors: Constantinos Costa, Evaggelika Pitoura
Publisher: CEUR-WS.org
Number of pages: 6
Publication status: Published - 2021
Event: 2021 Workshops of the EDBT/ICDT Joint Conference, EDBT/ICDT-WS 2021 - Nicosia, Cyprus
Duration: 23 Mar 2021 → …

Publication series

Name: CEUR Workshop Proceedings
Publisher: CEUR-WS.org
Volume: 2841
ISSN (Print): 1613-0073

Conference

Conference: 2021 Workshops of the EDBT/ICDT Joint Conference, EDBT/ICDT-WS 2021
Country/Territory: Cyprus
City: Nicosia
Period: 23/03/21 → …

Bibliographical note

Funding Information:
This work was partially funded by the EU H2020 project Smart-DataLake (825041).

Publisher Copyright:
© 2021 Copyright for this paper by its author(s).

Copyright:
Copyright 2021 Elsevier B.V., All rights reserved.
