Recent developments in pattern mining

T.G.K. Calders

    Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; Academic; peer-review

    Abstract

    Pattern mining is one of the most researched topics in the data mining community. Literally hundreds of algorithms for efficiently enumerating all frequent itemsets have been proposed. These exhaustive algorithms, however, all suffer from the pattern explosion problem. Depending on the minimal support threshold, even for moderately sized databases, millions of patterns may be generated. Although this problem is by now well recognized in the pattern mining community, it has not yet been solved satisfactorily. In my talk I will give an overview of the different approaches that have been proposed to alleviate this problem. As a first step, constraint-based mining and condensed representations such as the closed itemsets and the non-derivable itemsets were introduced. These methods, however, still produce too many results, many of them redundant. More recently, promising methods based upon the minimum description length principle, information theory, and statistical models have been introduced. We show the respective advantages and disadvantages of these approaches and their connections, and illustrate their usefulness on real-life data. After this overview we move from itemsets to more complex patterns, such as sequences and graphs. Even though these extensions seem trivial at first, they turn out to be quite challenging. I will end my talk with an overview of what I consider to be important open questions in this fascinating research area.
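
As a rough illustration of the terms used in the abstract, the following Python sketch enumerates all frequent itemsets of a tiny database by brute force and then keeps only the closed ones (itemsets with no proper superset of equal support). The transactions, the threshold, and the function names are hypothetical and chosen only for this example; they are not taken from the talk, and the brute-force search stands in for the far more efficient algorithms the abstract alludes to.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Map every itemset contained in at least min_support transactions to its support count."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        found_any = False
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            support = sum(1 for t in transactions if itemset <= t)
            if support >= min_support:
                result[itemset] = support
                found_any = True
        if not found_any:
            # Support is anti-monotone: if no itemset of this size is
            # frequent, no larger itemset can be frequent either.
            break
    return result


def closed_itemsets(frequent):
    """Keep only the itemsets that have no proper superset with the same support."""
    return {
        itemset: support
        for itemset, support in frequent.items()
        if not any(itemset < other and support == frequent[other] for other in frequent)
    }


# Hypothetical toy database of four transactions over the items a, b, c.
transactions = [frozenset("abc"), frozenset("abc"), frozenset("ab"), frozenset("ac")]
frequent = frequent_itemsets(transactions, min_support=2)
closed = closed_itemsets(frequent)
print(len(frequent), len(closed))  # 7 4
```

Even on this toy input, every non-empty subset of {a, b, c} is frequent, while the closed itemsets ({a}, {a, b}, {a, c}, {a, b, c}) carry the same support information in fewer patterns. With realistic numbers of items the gap between the two counts can grow very quickly, which is the pattern explosion the abstract refers to.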
    Original language: English
    Title of host publication: Discovery Science (15th International Conference, DS 2012, Lyon, France, October 29-31, 2012. Proceedings)
    Editors: J.-G. Ganascia, Ph. Lenca, J.-M. Petit
    Place of Publication: Berlin
    Publisher: Springer
    Pages: 2-2
    ISBN (Print): 978-3-642-33491-7
    Publication status: Published - 2012
    Event: 15th International Conference on Discovery Science
    Duration: 29 Oct 2012 to 31 Oct 2012

    Publication series

    Name: Lecture Notes in Computer Science
    Volume: 7569
    ISSN (Print): 0302-9743

    Conference

    Conference: 15th International Conference on Discovery Science
    Period: 29/10/12 to 31/10/12
