Word semantic similarity for morphologically rich languages

Kalliopi Zervanou, Elias Iosif, Alexandros Potamianos

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > Academic > peer-review

6 Citations (Scopus)

Abstract

In this work, we investigate the role of morphology in the performance of semantic similarity estimation for morphologically rich languages, such as German and Greek. The challenge in processing languages with richer morphology than English lies in reducing estimation error while addressing the semantic distortion introduced by a stemmer or a lemmatiser. For this purpose, we propose a methodology for selective stemming, based on a semantic distortion metric. The proposed algorithm is tested on the task of similarity estimation between words using two types of corpus-based similarity metrics: co-occurrence-based and context-based. For morphologically rich languages, stemming boosts performance with the context-based metric, unlike English, where the best results are obtained with the co-occurrence-based metric. A key finding is that the reduction in estimation error differs depending on whether a word is used as a feature or as a target word.
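The two metric families named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the Dice coefficient and cosine similarity are assumed stand-ins for a co-occurrence-based and a context-based metric, and the `stems` lookup table and the `stem_selected` helper are hypothetical. The semantic distortion metric that decides which words to stem is not described in this record, so the set of words to stem is simply passed in by hand.

```python
from collections import Counter
from math import sqrt


def dice_cooccurrence(w1, w2, sentences):
    """Co-occurrence-based similarity (assumed: Dice over sentence-level hits)."""
    s1 = {i for i, s in enumerate(sentences) if w1 in s}
    s2 = {i for i, s in enumerate(sentences) if w2 in s}
    if not s1 or not s2:
        return 0.0
    return 2 * len(s1 & s2) / (len(s1) + len(s2))


def context_vector(word, sentences, window=1):
    """Count the tokens (features) seen within `window` positions of `word`."""
    counts = Counter()
    for s in sentences:
        for i, tok in enumerate(s):
            if tok == word:
                lo = max(0, i - window)
                counts.update(s[lo:i] + s[i + 1:i + 1 + window])
    return counts


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def context_similarity(w1, w2, sentences, window=1):
    """Context-based similarity (assumed: cosine of context count vectors)."""
    return cosine(context_vector(w1, sentences, window),
                  context_vector(w2, sentences, window))


def stem_selected(sentences, stems, selected):
    """Hypothetical selective stemming: only tokens in `selected` are replaced."""
    return [[stems[t] if t in selected and t in stems else t for t in s]
            for s in sentences]


# Toy corpus: the inflected verb forms keep the two contexts apart.
sentences = [["a", "dog", "barks"], ["the", "puppy", "barked"]]
stems = {"barks": "bark", "barked": "bark"}  # hypothetical stem lookup

before = context_similarity("dog", "puppy", sentences)  # no shared features: 0.0
stemmed = stem_selected(sentences, stems, selected={"barks", "barked"})
after = context_similarity("dog", "puppy", stemmed)     # shared feature "bark": about 0.5
print(before, after)
```

In this toy case only the feature words are stemmed while the target words `dog` and `puppy` are left untouched, which loosely mirrors the abstract's distinction between a word used as a feature and a word used as a target.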

Language: English
Title of host publication: Proceedings of the 9th International Conference on Language Resources and Evaluation, LREC 2014
Editors: Nicoletta Calzolari, Khalid Choukri, Sara Goggi, Thierry Declerck, Joseph Mariani, Bente Maegaard, Asuncion Moreno, Jan Odijk, Helene Mazo, Stelios Piperidis, Hrafn Loftsson
Publisher: European Language Resources Association (ELRA)
Pages: 1642-1648
Number of pages: 7
ISBN (electronic): 9782951740884
Status: Published - 2014
Externally published: Yes
Event: 9th International Conference on Language Resources and Evaluation, LREC 2014 - Reykjavik, Iceland
Duration: 26 May 2014 to 31 May 2014

Conference

Conference: 9th International Conference on Language Resources and Evaluation, LREC 2014
Country: Iceland
City: Reykjavik
Period: 26/05/14 to 31/05/14

Fingerprint

semantics
language
performance
methodology
Semantic Similarity
Language
Co-occurrence

Keywords

Distributional semantic models, Lexical semantics, Morphologically rich languages, Morphology

    Cite this

    Zervanou, K., Iosif, E., & Potamianos, A. (2014). Word semantic similarity for morphologically rich languages. In N. Calzolari, K. Choukri, S. Goggi, T. Declerck, J. Mariani, B. Maegaard, A. Moreno, J. Odijk, H. Mazo, S. Piperidis, ... H. Loftsson (Eds.), Proceedings of the 9th International Conference on Language Resources and Evaluation, LREC 2014 (pp. 1642-1648). European Language Resources Association (ELRA).
    @inproceedings{a3064b50bbd44a9283dfdc01977a5d3d,
    title = "Word semantic similarity for morphologically rich languages",
    keywords = "Distributional semantic models, Lexical semantics, Morphologically rich languages, Morphology",
    author = "Kalliopi Zervanou and Elias Iosif and Alexandros Potamianos",
    year = "2014",
    language = "English",
    pages = "1642--1648",
    editor = "Nicoletta Calzolari and Khalid Choukri and Sara Goggi and Thierry Declerck and Joseph Mariani and Bente Maegaard and Asuncion Moreno and Jan Odijk and Helene Mazo and Stelios Piperidis and Hrafn Loftsson",
    booktitle = "Proceedings of the 9th International Conference on Language Resources and Evaluation, LREC 2014",
    publisher = "European Language Resources Association (ELRA)",

    }

    TY - GEN

    T1 - Word semantic similarity for morphologically rich languages

    AU - Zervanou,Kalliopi

    AU - Iosif,Elias

    AU - Potamianos,Alexandros

    PY - 2014

    Y1 - 2014

    KW - Distributional semantic models

    KW - Lexical semantics

    KW - Morphologically rich languages

    KW - Morphology

    UR - http://www.scopus.com/inward/record.url?scp=85026322678&partnerID=8YFLogxK

    M3 - Conference contribution

    SP - 1642

    EP - 1648

    BT - Proceedings of the 9th International Conference on Language Resources and Evaluation, LREC 2014

    PB - European Language Resources Association (ELRA)

    ER -
