Multi-Dimensional Event Data in Graph Databases

Stefan Esser, Dirk Fahland (Corresponding author)

Research output: Contribution to journal › Article › Academic › peer-review

68 Citations (Scopus)
95 Downloads (Pure)

Abstract

Process event data is usually stored either in a sequential process event log or in a relational database. While the sequential, single-dimensional nature of event logs aids querying for (sub)sequences of events based on temporal relations such as “directly/eventually-follows,” it does not support querying multi-dimensional event data of multiple related entities. Relational databases allow storing multi-dimensional event data, but existing query languages do not support querying for sequences or paths of events in terms of temporal relations. In this paper, we propose a general data model for multi-dimensional event data based on labeled property graphs that allows storing structural and temporal relations in a single, integrated graph-based data structure in a systematic way. We provide semantics for all concepts of our data model, and generic queries for modeling event data over multiple entities that interact synchronously and asynchronously. The queries allow for efficiently converting large real-life event data sets into our data model, and we provide 5 converted data sets for further research. We show that typical and advanced queries for retrieving and aggregating such multi-dimensional event data can be formulated and executed efficiently in the existing query language Cypher, giving rise to several new research questions. Specifically, aggregation queries on our data model enable process mining over multiple inter-related entities using off-the-shelf technology.
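To make the "directly-follows" relation over multiple related entities more concrete, the following is a minimal Python sketch, not the authors' implementation and not Cypher. It assumes that each event may be correlated to several entities and derives one directly-follows edge list per entity by ordering that entity's events by time. The `Event` class, the `directly_follows` function, and the order/item example data are hypothetical names and values chosen for illustration only.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    # All field names are illustrative assumptions, not the paper's schema.
    event_id: str
    activity: str
    timestamp: int    # any orderable time value
    entities: tuple   # ids of the entities this event is correlated to


def directly_follows(events):
    """Return {entity_id: [(earlier_event_id, later_event_id), ...]}."""
    # Group events by every entity they are correlated to.
    per_entity = defaultdict(list)
    for ev in events:
        for entity_id in ev.entities:
            per_entity[entity_id].append(ev)

    # Within each entity, link each event to its immediate successor in time.
    df_edges = {}
    for entity_id, evs in per_entity.items():
        ordered = sorted(evs, key=lambda e: e.timestamp)
        df_edges[entity_id] = [
            (a.event_id, b.event_id) for a, b in zip(ordered, ordered[1:])
        ]
    return df_edges


# Hypothetical example: one order and one item sharing the event e2.
events = [
    Event("e1", "Create Order", 1, ("order1",)),
    Event("e2", "Pick Item", 2, ("order1", "item1")),
    Event("e3", "Pack Item", 3, ("item1",)),
    Event("e4", "Ship Order", 4, ("order1",)),
]
edges = directly_follows(events)
```

In this sketch the shared event `e2` appears in the sequences of both entities, which is the kind of multi-dimensional structure that a single sequential event log cannot represent directly.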

Original language: English
Pages (from-to): 109–141
Number of pages: 33
Journal: Journal on Data Semantics
Volume: 10
Issue number: 1-2
Early online date: 27 May 2021
Publication status: Published - 1 Jun 2021

Bibliographical note

Publisher Copyright:
© 2021, The Author(s).

Funding

The results of this paper have been greatly influenced and shaped by discussions and inspiration over several years with Wil M.P. van der Aalst, Claudio di Ciccio, Marlon Dumas, Manuel Haug, Martin Klenk, Massimiliano de Leoni, Xixi Lu, Jan Mendling, Marco Montali, Alexander Rinke, and Stefan Schöning. The results would have been impossible without the careful preparation of the public BPI Challenge event data sets by Boudewijn van Dongen in preserving their multi-dimensional nature for this research, and without the regular availability of George Fletcher for introducing us to graph databases and providing insights into this field.

Keywords

  • Event log
  • Graph databases
  • Labeled property graphs
  • Multi-dimensional processes
  • Process mining
  • Querying
