Alignment-based trace clustering

T. Chatain, J. Carmona, B.F. van Dongen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

8 Citations (Scopus)

Abstract

A novel method to cluster event log traces is presented in this paper. In contrast to the approaches in the literature, the clustering approach of this paper assumes an additional input: a process model that describes the current process. The core idea of the algorithm is to use model traces as centroids of the clusters detected, computed from a generalization of the notion of alignment. This way, model explanations of observed behavior are the driving force to compute the clusters, rather than current model-agnostic approaches, which, for example, group log traces merely by their vector-space similarity. We believe alignment-based trace clustering provides results more useful for stakeholders. Moreover, in case of log incompleteness, noisy logs or concept drift, they can be more robust for dealing with highly deviating traces. The technique of this paper can be combined with any clustering technique to provide model explanations to the clusters computed. The proposed technique relies on encoding the individual alignment problems into the (pseudo-)Boolean domain, and has been implemented in our tool DarkSider, which uses an open-source solver.
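The idea of using model traces as cluster centroids can be illustrated with a minimal sketch. It is not the method of the paper: it assumes the model's traces are already available as a small explicit list, and it substitutes a plain Levenshtein edit distance for alignment cost instead of the (pseudo-)Boolean encoding mentioned in the abstract. The names `edit_distance`, `cluster_by_model_trace` and `max_distance` are hypothetical and chosen only for this illustration.

```python
from collections import defaultdict


def edit_distance(a, b):
    """Levenshtein distance between two sequences of activity labels.

    Used here only as a simple stand-in for alignment cost (assumption).
    """
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # drop x from the log trace
                curr[j - 1] + 1,         # add y from the model trace
                prev[j - 1] + (x != y),  # match (cost 0) or substitute (cost 1)
            ))
        prev = curr
    return prev[-1]


def cluster_by_model_trace(log, model_traces, max_distance):
    """Assign each log trace to its closest model trace (the centroid).

    Traces farther than max_distance from every model trace are returned
    separately as highly deviating, unclustered traces.
    """
    clusters = defaultdict(list)
    unclustered = []
    for trace in log:
        # min() keeps the first model trace in case of a tie.
        best = min(model_traces, key=lambda m: edit_distance(trace, m))
        if edit_distance(trace, best) <= max_distance:
            clusters[best].append(trace)
        else:
            unclustered.append(trace)
    return dict(clusters), unclustered


if __name__ == "__main__":
    # Hypothetical model behavior and event log, as tuples of activity labels.
    model_traces = [("a", "b", "c"), ("a", "d", "e")]
    log = [("a", "b", "c"), ("a", "b"), ("a", "d", "e", "e"), ("x", "y", "z", "w")]
    clusters, unclustered = cluster_by_model_trace(log, model_traces, max_distance=1)
    for centroid, members in clusters.items():
        print(centroid, "<-", members)
    print("unclustered:", unclustered)
```

In this sketch each cluster is explained by the model trace that serves as its centroid, and the distance bound keeps a strongly deviating trace such as `("x", "y", "z", "w")` out of every cluster, which loosely mirrors the robustness argument made in the abstract.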

Original language: English
Title of host publication: Conceptual Modeling - 36th International Conference, ER 2017, Proceedings
Editors: H.C. Mayr, G. Guizzardi, H. Ma, O. Pastor
Publisher: Springer
Pages: 295-308
Number of pages: 14
ISBN (Print): 9783319699035
DOIs: https://doi.org/10.1007/978-3-319-69904-2_24
Publication status: Published - 2017
Event: 36th International Conference on Conceptual Modeling (ER2017) - Universitat Politecnica de Valencia, Valencia, Spain
Duration: 6 Nov 2017 to 9 Nov 2017
Conference number: 36
http://er2017.pros.webs.upv.es/

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10650 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349
Name: Information Systems and Applications, series LNISA
Volume: 10650

Conference

Conference: 36th International Conference on Conceptual Modeling (ER2017)
Abbreviated title: ER 2017
Country: Spain
City: Valencia
Period: 6/11/17 to 9/11/17
Internet address: http://er2017.pros.webs.upv.es/

Fingerprint

Alignment
Trace
Clustering
Concept Drift
Incompleteness
Vector spaces
Driving Force
Centroid
Model
Open Source
Process Model
Encoding

Cite this

Chatain, T., Carmona, J., & van Dongen, B. F. (2017). Alignment-based trace clustering. In H. C. Mayr, G. Guizzardi, H. Ma, & O. Pastor (Eds.), Conceptual Modeling - 36th International Conference, ER 2017, Proceedings (pp. 295-308). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 10650 LNCS), (Information Systems and Applications, series LNISA; Vol. 10650). Springer. https://doi.org/10.1007/978-3-319-69904-2_24
@inproceedings{3eb10ea1aa4e4b64a06ddbced650bc90,
title = "Alignment-based trace clustering",
abstract = "A novel method to cluster event log traces is presented in this paper. In contrast to the approaches in the literature, the clustering approach of this paper assumes an additional input: a process model that describes the current process. The core idea of the algorithm is to use model traces as centroids of the clusters detected, computed from a generalization of the notion of alignment. This way, model explanations of observed behavior are the driving force to compute the clusters, instead of current model agnostic approaches, e.g., which group log traces merely on their vector-space similarity. We believe alignment-based trace clustering provides results more useful for stakeholders. Moreover, in case of log incompleteness, noisy logs or concept drift, they can be more robust for dealing with highly deviating traces. The technique of this paper can be combined with any clustering technique to provide model explanations to the clusters computed. The proposed technique relies on encoding the individual alignment problems into the (pseudo-)Boolean domain, and has been implemented in our tool DarkSider that uses an open-source solver.",
author = "T. Chatain and J. Carmona and {van Dongen}, B.F.",
year = "2017",
doi = "10.1007/978-3-319-69904-2_24",
language = "English",
isbn = "9783319699035",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer",
pages = "295--308",
editor = "H.C. Mayr and G. Guizzardi and H. Ma and {Pastor}, O.",
booktitle = "Conceptual Modeling - 36th International Conference, ER 2017, Proceedings",
address = "Germany",

}

TY - GEN

T1 - Alignment-based trace clustering

AU - Chatain, T.

AU - Carmona, J.

AU - van Dongen, B.F.

PY - 2017

Y1 - 2017

AB - A novel method to cluster event log traces is presented in this paper. In contrast to the approaches in the literature, the clustering approach of this paper assumes an additional input: a process model that describes the current process. The core idea of the algorithm is to use model traces as centroids of the clusters detected, computed from a generalization of the notion of alignment. This way, model explanations of observed behavior are the driving force to compute the clusters, instead of current model agnostic approaches, e.g., which group log traces merely on their vector-space similarity. We believe alignment-based trace clustering provides results more useful for stakeholders. Moreover, in case of log incompleteness, noisy logs or concept drift, they can be more robust for dealing with highly deviating traces. The technique of this paper can be combined with any clustering technique to provide model explanations to the clusters computed. The proposed technique relies on encoding the individual alignment problems into the (pseudo-)Boolean domain, and has been implemented in our tool DarkSider that uses an open-source solver.

UR - http://www.scopus.com/inward/record.url?scp=85033445854&partnerID=8YFLogxK

U2 - 10.1007/978-3-319-69904-2_24

DO - 10.1007/978-3-319-69904-2_24

M3 - Conference contribution

AN - SCOPUS:85033445854

SN - 9783319699035

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 295

EP - 308

BT - Conceptual Modeling - 36th International Conference, ER 2017, Proceedings

A2 - Mayr, H.C.

A2 - Guizzardi, G.

A2 - Ma, H.

A2 - Pastor, O.

PB - Springer

ER -
