Deducing bounds on the support of itemsets

T. Calders

Research output: Chapter in Book/Report/Conference proceedingChapterAcademicpeer-review

10 Citations (Scopus)

Abstract

Mining Frequent Itemsets is the core operation of many data mining algorithms. This operation however, is very data intensive and sometimes produces a prohibitively large output. In this paper we give a complete set of rules for deducing tight bounds on the support of an itemset if the supports of all its subsets are known. Based on the derived bounds [l,u] on the support of a candidate itemset I, we can decide not to access the database to count the support of I if l is larger than the support threshold (I will certainly be frequent), or if u is below the threshold (I will certainly fail the frequency test). We can also use the deduction rules to reduce the size of an adequate representation of the collection of frequent sets; all itemsets I with bounds [l,u], where l =u, do not need to be stored explicitly. To assess the usability in practice, we implemented the deduction rules and we present experiments on real-life data sets.
Original languageEnglish
Title of host publicationDatabase Support for Data Mining Applications : Discovering Knowledge with Inductive Queries
EditorsR. Meo, P.L. Lanzi, M. Klemettinen
Place of PublicationBerlin
PublisherSpringer
Pages214-233
ISBN (Print)3-540-22479-3
DOIs
Publication statusPublished - 2004

Publication series

NameLecture Notes in Computer Science
Volume2682
ISSN (Print)0302-9743

Fingerprint Dive into the research topics of 'Deducing bounds on the support of itemsets'. Together they form a unique fingerprint.

  • Cite this

    Calders, T. (2004). Deducing bounds on the support of itemsets. In R. Meo, P. L. Lanzi, & M. Klemettinen (Eds.), Database Support for Data Mining Applications : Discovering Knowledge with Inductive Queries (pp. 214-233). (Lecture Notes in Computer Science; Vol. 2682). Berlin: Springer. https://doi.org/10.1007/b99016