In this paper we study the discovery of frequent sequences, with the aim of extending the non-derivable condensed representation from frequent itemset mining to sequential pattern mining. We start with a negative result: in the context of frequent sequences, the notion of non-derivability is meaningless.
This negative result motivated us to look at a slightly different problem: the mining of conjunctions of sequential patterns. This extended class of patterns turns out to have much nicer mathematical properties. For example, for this class of patterns we are able to extend the notion of non-derivable itemsets in a non-trivial way, based on a new, previously unexploited theoretical definition of equivalence classes for sequential patterns. As a side effect of taking conjunctions of sequences as the pattern type, we can easily form association rules between sequences. We believe that building a theoretical framework and an efficient approach for the problem of extracting sequence association rules is the first step toward generalizing association rules to all complex and ordered patterns.
This is an extended abstract of an article published in the Data Mining and Knowledge Discovery journal.
|Title||Machine Learning and Knowledge Discovery in Databases (European Conference, ECML PKDD 2008, Antwerp, Belgium, September 15-19, 2008, Proceedings, Part I)|
|Editors||W. Daelemans, B. Goethals, K. Morik|
|Place of publication||Berlin|
|ISBN (print)||978-3-540-87478-2|
|Status||Published - 2008|
|Name||Lecture Notes in Computer Science|
|ISSN (print)||0302-9743|