Process mining: on the balance between underfitting and overfitting

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review



Process mining techniques attempt to extract non-trivial and useful information from event logs. One aspect of process mining is control-flow discovery, i.e., automatically constructing a process model (e.g., a Petri net) describing the causal dependencies between activities. One of the essential problems in process mining is that one cannot assume to have seen all possible behavior. At best, one has seen a representative subset. Therefore, classical synthesis techniques are not suitable as they aim at finding a model that is able to exactly reproduce the log. Existing process mining techniques try to avoid such "overfitting" by generalizing the model to allow for more behavior. This generalization is often driven by the representation language and very crude assumptions about completeness. As a result, parts of the model are "overfitting" (allow only what has actually been observed) while other parts may be "underfitting" (allow for much more behavior without strong support for it). This talk will present the main challenges posed by real-life applications of process mining and show that it is possible to balance between overfitting and underfitting in a controlled manner.
Original language: English
Title of host publication: Proceedings of the Second International Workshop on the Induction of Process Models at ECML PKDD 2008 (IPM 2008), 15 September 2008, Antwerp, Belgium
Editors: W. Bridewell, T. Calders, A.K. Alves de Medeiros, S. Kramer, M. Pechenizkiy, L. Todorovski
Place of Publication: Antwerpen
Publisher: University of Antwerp
Publication status: Published - 2008

