N-gram representations for comment filtering

D. Brand, St. Kroon, B. Van Der Merwe, L. Cleophas

    Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, Academic, peer-reviewed

    Abstract

    Accurate classifiers for short texts are valuable assets in many applications. Especially in online communities, where users contribute to content in the form of posts and comments, an effective way of automatically categorising posts proves highly valuable. This paper investigates the use of N-grams as features for short text classification, and compares it to manual feature design techniques that have been popular in this domain. We find that the N-gram representations greatly outperform manual feature extraction techniques.

    Original language: English
    Title: SAICSIT '15 Proceedings of the 2015 Annual Research Conference on South African Institute of Computer Scientists and Information Technologists, 28-30 September 2015, Stellenbosch, South Africa
    Place of publication: New York
    Publisher: Association for Computing Machinery, Inc
    Pages: 1-10
    ISBN (print): 9781450336833
    Status: Published - 28 Sep 2015
    Event: 2015 Annual Research Conference of the South African Institute of Computer Scientists and Information Technologists (SAICSIT 2015) - Stellenbosch Institute for Advanced Study (STIAS), Stellenbosch, South Africa
    Duration: 28 Sep 2015 to 30 Sep 2015
    http://www.saicsit2015.org/

    Conference

    Conference: 2015 Annual Research Conference of the South African Institute of Computer Scientists and Information Technologists (SAICSIT 2015)
    Abbreviated title: SAICSIT 2015
    Country/Territory: South Africa
    City: Stellenbosch
    Period: 28/09/15 to 30/09/15
    Other: "Knowledge through Technology"