Multi-Dimensional Event Data in Graph Databases

Stefan Esser, Dirk Fahland (Corresponding author)

Research output: Contribution to journal › Article › Academic › peer-review

69 Citations (Scopus)
95 Downloads (Pure)

Abstract

Process event data is usually stored either in a sequential process event log or in a relational database. While the sequential, single-dimensional nature of event logs aids querying for (sub)sequences of events based on temporal relations such as “directly/eventually-follows,” it does not support querying multi-dimensional event data of multiple related entities. Relational databases allow storing multi-dimensional event data, but existing query languages do not support querying for sequences or paths of events in terms of temporal relations. In this paper, we propose a general data model for multi-dimensional event data based on labeled property graphs that allows storing structural and temporal relations in a single, integrated graph-based data structure in a systematic way. We provide semantics for all concepts of our data model, and generic queries for modeling event data over multiple entities that interact synchronously and asynchronously. The queries allow for efficiently converting large real-life event data sets into our data model, and we provide 5 converted data sets for further research. We show that typical and advanced queries for retrieving and aggregating such multi-dimensional event data can be formulated and executed efficiently in the existing query language Cypher, giving rise to several new research questions. Specifically, aggregation queries on our data model enable process mining over multiple inter-related entities using off-the-shelf technology.
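The idea of keeping one "directly-follows" relation per entity, rather than one global sequence, can be illustrated with a minimal in-memory sketch. This is not the authors' Cypher implementation: the `Event` class, the function names, and the order/item example data are all hypothetical and chosen only for illustration, and plain Python collections stand in for a graph database.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical event node; field names are assumptions, not the paper's schema."""
    event_id: str
    activity: str
    timestamp: int        # any orderable time value
    entities: frozenset   # ids of all entities this event is correlated to


def directly_follows(events):
    """Derive (entity, earlier_event_id, later_event_id) edges per entity."""
    per_entity = defaultdict(list)
    for ev in events:
        for entity in ev.entities:
            per_entity[entity].append(ev)

    edges = set()
    for entity, evs in per_entity.items():
        # Order each entity's events by time; event_id breaks ties deterministically.
        evs.sort(key=lambda e: (e.timestamp, e.event_id))
        for earlier, later in zip(evs, evs[1:]):
            edges.add((entity, earlier.event_id, later.event_id))
    return edges


def aggregate_by_activity(events, edges):
    """Count directly-follows edges between activities, across all entities."""
    activity_of = {e.event_id: e.activity for e in events}
    counts = defaultdict(int)
    for _entity, earlier, later in edges:
        counts[(activity_of[earlier], activity_of[later])] += 1
    return dict(counts)


# Hypothetical example: event e2 belongs to both an order and an item.
events = [
    Event("e1", "Create Order", 1, frozenset({"order:1"})),
    Event("e2", "Pick Item", 2, frozenset({"order:1", "item:7"})),
    Event("e3", "Pack Item", 3, frozenset({"item:7"})),
    Event("e4", "Ship Order", 4, frozenset({"order:1"})),
]
edges = directly_follows(events)
activity_counts = aggregate_by_activity(events, edges)
```

In this sketch, event `e2` has two different successors (`e4` along the order, `e3` along the item), which a single sequential event log could not represent without duplicating or flattening events.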

Original language: English
Pages (from-to): 109–141
Number of pages: 33
Journal: Journal on Data Semantics
Volume: 10
Issue number: 1-2
Early online date: 27 May 2021
Status: Published - 1 Jun 2021

Bibliographical note

Publisher Copyright:
© 2021, The Author(s).

Funding

The results of this paper have been greatly influenced and shaped by discussions with, and inspiration from, Wil M.P. van der Aalst, Claudio di Ciccio, Marlon Dumas, Manuel Haug, Martin Klenk, Massimiliano de Leoni, Xixi Lu, Jan Mendling, Marco Montali, Alexander Rinke, and Stefan Schöning over several years. The results would have been impossible without the careful preparation of the public BPI Challenge event data sets by Boudewijn van Dongen, which preserved their multi-dimensional nature for this research, and without the regular availability of George Fletcher, who introduced us to graph databases and provided insights into this field.
