Expressive power of an algebra for data mining

T. Calders, L.V.S. Lakshmanan, R.T. Ng, J. Paredaens

    Research output: Contribution to journal, Article, Academic, peer-reviewed

    22 Citations (Scopus)

    Abstract

    The relational data model has simple and clear foundations on which significant theoretical and systems research has flourished. By contrast, most research on data mining has focused on algorithmic issues. A major open question is: what's an appropriate foundation for data mining, which can accommodate disparate mining tasks? We address this problem by presenting a database model and an algebra for data mining. The database model is based on the 3W-model introduced by Johnson et al. [2000]. This model relied on black box mining operators. A main contribution of this article is to open up these black boxes, by using generic operators in a data mining algebra. Two key operators in this algebra are regionize, which creates regions (or models) from data tuples, and a restricted form of looping called mining loop. Then the resulting data mining algebra MA is studied and properties concerning expressive power and complexity are established. We present results in three directions: (1) expressiveness of the mining algebra; (2) relations with alternative frameworks, and (3) interactions between regionize and mining loop.
    Original language: English
    Pages (from-to): 1169-1214
    Journal: ACM Transactions on Database Systems
    Volume: 31
    Issue number: 4
    Status: Published - 2006
