Towards distributed model analytics with Apache Spark

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; Academic; peer-review

2 Citations (Scopus)
1 Download (Pure)

Abstract

The growing number of models and other related artefacts in model-driven engineering has recently led to the emergence of approaches and tools for analyzing and managing them on a large scale. The SAMOS framework applies techniques inspired by information retrieval and data mining to analyze large sets of models. As data size and analysis complexity grow, however, further scalability is needed. In this paper, we extend SAMOS to operate on Apache Spark, a popular engine for distributed Big Data processing, by partitioning the data and parallelizing the comparison and analysis phases. We present preliminary studies using a cluster infrastructure and report the results for two datasets: one with 250 Ecore metamodels, where we detail the performance gain with various settings, and a larger one of 7.3k metamodels with nearly one million model elements, to further demonstrate scalability.

Original language: English
Title of host publication: MODELSWARD 2018 - Proceedings of the 6th International Conference on Model-Driven Engineering and Software Development
Editors: Slimane Hammoudi, Luis Ferreira Pires, Bran Selic
Publisher: SCITEPRESS-Science and Technology Publications, Lda.
Pages: 767-772
Number of pages: 6
ISBN (electronic): 978-989-758-283-7
Status: Published - 1 Jan 2018
Event: 6th International Conference on Model-Driven Engineering and Software Development, MODELSWARD 2018 - Funchal, Madeira, Portugal
Duration: 22 Jan 2018 to 24 Jan 2018

Conference

Conference: 6th International Conference on Model-Driven Engineering and Software Development, MODELSWARD 2018
Country: Portugal
City: Funchal, Madeira
Period: 22/01/18 to 24/01/18
