Efficient pattern mining of uncertain data with sampling

T. Calders, C. Garboni, B. Goethals

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

    56 Citations (Scopus)

    Abstract

    Mining frequent itemsets from transactional datasets is a well-known problem with good algorithmic solutions. In the case of uncertain data, however, several new techniques have been proposed. Unfortunately, these proposals often suffer when many items occur with many different probabilities. Here we propose an approach based on sampling by instantiating "possible worlds" of the uncertain data, on which we subsequently run optimized frequent itemset mining algorithms. As such we gain efficiency at a surprisingly low loss in accuracy. This is confirmed by a statistical and an empirical evaluation on real and synthetic data.
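
    The sampling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes an uncertain dataset is a list of transactions that map each item to an independent existence probability, and that the quantity of interest is the expected support of an itemset. All function names are hypothetical, and brute-force itemset counting stands in for the optimized frequent itemset miners the abstract refers to.

```python
import random
from collections import Counter
from itertools import combinations
from math import prod


def sample_world(uncertain_db, rng):
    """Instantiate one possible world: keep each item with its own probability."""
    return [
        frozenset(item for item, p in transaction.items() if rng.random() < p)
        for transaction in uncertain_db
    ]


def estimate_expected_supports(uncertain_db, n_worlds, max_size, seed=0):
    """Average itemset support over n_worlds sampled worlds.

    Brute-force counting is used here only for illustration; any exact
    frequent itemset miner could be run on the sampled worlds instead.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_worlds):
        for transaction in sample_world(uncertain_db, rng):
            items = sorted(transaction)
            for size in range(1, max_size + 1):
                for itemset in combinations(items, size):
                    counts[itemset] += 1
    return {itemset: c / n_worlds for itemset, c in counts.items()}


def exact_expected_support(uncertain_db, itemset):
    """Exact expected support under the item-independence assumption."""
    return sum(
        prod(transaction.get(item, 0.0) for item in itemset)
        for transaction in uncertain_db
    )


def frequent_itemsets(uncertain_db, min_esup, n_worlds=1000, max_size=2, seed=0):
    """Itemsets whose estimated expected support reaches min_esup."""
    estimates = estimate_expected_supports(uncertain_db, n_worlds, max_size, seed)
    return {s: v for s, v in estimates.items() if v >= min_esup}
```

    Comparing `estimate_expected_supports` with `exact_expected_support` on a small dataset gives a feel for how the sampling error shrinks as `n_worlds` grows.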
    Original language: English
    Title of host publication: Advances in Knowledge Discovery and Data Mining (14th Pacific-Asia Conference, PAKDD 2010, Hyderabad, India, June 21-24, 2010. Proceedings, Part I)
    Editors: M.J. Zaki, J.X. Yu, B. Ravindran, V. Pudi
    Place of Publication: Berlin
    Publisher: Springer
    Pages: 480-487
    ISBN (Print): 978-3-642-13656-6
    Publication status: Published - 2010

    Publication series

    Name: Lecture Notes in Computer Science
    Volume: 6118
    ISSN (Print): 0302-9743
