FAT-CAT : frequent attributes tree based classification

J. de Knijf

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

    11 Citations (Scopus)

    Abstract

    The natural representation of XML data is its underlying tree structure. Analyzing these trees directly ensures that no structural information is lost. The trees can be analyzed efficiently thanks to frequent pattern mining algorithms that work directly on tree-structured data. In this work we describe a classification method for XML data based on frequent attribute trees. From these frequent patterns we select so-called emerging patterns and use them as binary features in a decision tree algorithm. The experimental results show that combining emerging attribute tree patterns with standard classification methods is a promising approach to the classification of XML documents.
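The feature-construction step described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: each document and each pattern is simplified to a flat set of attribute names (real attribute trees would need subtree matching), emerging patterns are selected with the standard growth-rate criterion (support in the target class divided by support in the other class), and the threshold `min_growth` and all function names are hypothetical.

```python
from typing import FrozenSet, List, Sequence

# Simplification (assumption): a pattern or document is a set of attribute
# names; set inclusion stands in for "the tree pattern occurs in the document".
Pattern = FrozenSet[str]


def support(pattern: Pattern, docs: Sequence[Pattern]) -> float:
    """Fraction of documents that contain the pattern."""
    if not docs:
        return 0.0
    return sum(pattern <= doc for doc in docs) / len(docs)


def emerging_patterns(
    patterns: Sequence[Pattern],
    target_docs: Sequence[Pattern],
    other_docs: Sequence[Pattern],
    min_growth: float = 2.0,  # hypothetical threshold, not from the paper
) -> List[Pattern]:
    """Keep patterns whose support grows by at least min_growth in the target class."""
    selected = []
    for pattern in patterns:
        s_target = support(pattern, target_docs)
        s_other = support(pattern, other_docs)
        if s_target == 0:
            continue  # never occurs in the target class
        growth = float("inf") if s_other == 0 else s_target / s_other
        if growth >= min_growth:
            selected.append(pattern)
    return selected


def to_binary_features(doc: Pattern, patterns: Sequence[Pattern]) -> List[int]:
    """One 0/1 feature per selected pattern: does the pattern occur in the document?"""
    return [1 if pattern <= doc else 0 for pattern in patterns]


# Toy data (invented for illustration only).
class_a = [
    frozenset({"title", "author", "year"}),
    frozenset({"title", "author"}),
    frozenset({"title", "author", "isbn"}),
]
class_b = [
    frozenset({"title", "price"}),
    frozenset({"title", "price", "author"}),
    frozenset({"price", "sku"}),
]
candidates = [
    frozenset({"title"}),
    frozenset({"title", "author"}),
    frozenset({"price"}),
]

eps_a = emerging_patterns(candidates, class_a, class_b)
eps_b = emerging_patterns(candidates, class_b, class_a)
features = eps_a + eps_b
vectors = [to_binary_features(doc, features) for doc in class_a + class_b]
```

The resulting 0/1 vectors, paired with their class labels, could then be passed to any decision tree learner; the sketch stops before that step because the abstract does not say which learner or parameters were used.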
    Original language: English
    Title of host publication: Comparative Evaluation of XML Information Retrieval Systems (5th International Workshop of the Initiative for the Evaluation of XML Retrieval, INEX 2006, Dagstuhl Castle, Germany, December 17-20, 2006, Revised and Selected Papers)
    Editors: N. Fuhr, M. Lalmas, A. Trotman
    Place of Publication: Berlin
    Publisher: Springer
    Pages: 485-496
    ISBN (Print): 978-3-540-73887-9
    DOIs: https://doi.org/10.1007/978-3-540-73888-6_45
    Publication status: Published - 2006

    Publication series

    Name: Lecture Notes in Computer Science
    Volume: 4518
    ISSN (Print): 0302-9743

  • Cite this

    Knijf, de, J. (2006). FAT-CAT : frequent attributes tree based classification. In N. Fuhr, M. Lalmas, & A. Trotman (Eds.), Comparative Evaluation of XML Information Retrieval Systems (5th International Workshop of the Initiative for the Evaluation of XML Retrieval, INEX 2006, Dagstuhl Castle, Germany, December 17-20, 2006, Revised and Selected Papers) (pp. 485-496). (Lecture Notes in Computer Science; Vol. 4518). Springer. https://doi.org/10.1007/978-3-540-73888-6_45