Classification of historical notary acts with noisy labels

I. Efremova, A. Montes Garcia, T.G.K. Calders

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

    4 Citations (Scopus)

    Abstract

    This paper approaches the problem of automatic classification of real-world historical notary acts from the 14th to the 20th century. We deal with category ambiguity, noisy labels and imbalanced data. Our goal is to assign an appropriate category to each notary act in the archive collection. We investigate a variety of existing techniques and describe a framework for dealing with noisy labels, which includes category resolution, evaluation of inter-annotator agreement and the application of a two-level classification. The maximum accuracy we achieve is 88%, which is comparable to the agreement between human annotators.
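
The abstract does not say which statistic was used to evaluate inter-annotator agreement. As a minimal sketch, assuming Cohen's kappa for two annotators, agreement corrected for chance could be computed as follows. The function name `cohen_kappa` and the category labels ("sale", "will", "lease") are hypothetical and are not taken from the paper.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    Assumption: this statistic is chosen for illustration only; the
    paper's abstract does not name its agreement measure.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")

    n = len(labels_a)
    # Observed agreement: fraction of items given the same category by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: probability that both pick the same category
    # independently, given each annotator's own label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)

    if expected == 1.0:
        # Both annotators used one identical category for every item.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels for four notary acts from two annotators.
annotator_a = ["sale", "sale", "will", "lease"]
annotator_b = ["sale", "will", "will", "lease"]
kappa = cohen_kappa(annotator_a, annotator_b)
```

A kappa near 1 indicates agreement well above chance, while a value near 0 indicates agreement no better than chance. This is one way to put a classifier's accuracy next to a human agreement figure, as the abstract does, but the two numbers are only directly comparable when they are computed on the same scale.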
    Original language: English
    Title of host publication: Advances in Information Retrieval (37th European Conference on IR Research, ECIR 2015, Vienna, Austria, March 29-April 2, 2015. Proceedings)
    Editors: A. Hanbury, G. Kazai, A. Rauber, N. Fuhr
    Place of Publication: Berlin
    Publisher: Springer
    Pages: 49-54
    ISBN (Print): 978-3-319-16353-6
    Publication status: Published - 2015
    Event: 37th European Conference on Information Retrieval (ECIR 2015), March 29-April 2, 2015, Vienna, Austria
    Duration: 29 Mar 2015 to 2 Apr 2015

    Publication series

    Name: Lecture Notes in Computer Science
    Volume: 9022
    ISSN (Print): 0302-9743

    Conference

    Conference: 37th European Conference on Information Retrieval (ECIR 2015), March 29-April 2, 2015, Vienna, Austria
    Abbreviated title: ECIR 2015
    Country: Austria
    City: Vienna
    Period: 29/03/15 to 2/04/15
