Error detection in spoken human-machine interaction

E.J. Krahmer, M.G.J. Swerts, M. Theune, M.F. Weegels

    Research output: Contribution to journal, Article, Academic, peer-review

    25 Citations (Scopus)

    Abstract

    Given the state of the art of current language and speech technology, errors are unavoidable in present-day spoken dialogue systems. Therefore, one of the main concerns in dialogue design is how to decide whether or not the system has understood the user correctly. In human-human communication, dialogue participants are continuously sending and receiving signals on the status of the information being exchanged. We claim that if spoken dialogue systems were able to detect such cues and change their strategy accordingly, the interaction between user and system would improve. The goals of the present study are therefore twofold: (i) to find out which positive and negative cues people actually use in human-machine interaction in response to explicit and implicit verification questions and how informative these signals are, and (ii) to explore the possibilities of spotting errors automatically and on-line. To reach these goals, we first perform a descriptive analysis, followed by experiments with memory-based machine learning techniques. It appears that people systematically use negative/marked cues when there are communication problems. The experiments using memory-based machine learning techniques suggest that it may be possible to spot errors automatically and on-line with high accuracy, in particular when focussing on combinations of cues. This kind of information may turn out to be highly relevant for spoken dialogue systems, e.g., by providing quantitative criteria for changing the dialogue strategy or speech recognition engine.
    Original language: English
    Pages (from-to): 19-30
    Number of pages: 12
    Journal: International Journal of Speech Technology
    Volume: 4
    Issue number: 1
    Publication status: Published - 2001
