Abstract
Against the background of developments in speech-based and multimodal interfaces, we present research on determining the addressee of an utterance in the context of mixed human-human and multimodal human-computer interaction. Working with data taken from realistic scenarios, we explore several features for their relevance to the question of who the addressee of an utterance is: the eye gaze of both speaker and listener, dialogue history, and utterance length. With respect to eye gaze, we inspect the detailed timing of gaze shifts between different communication partners (human or computer). We show that these features yield improved classification of utterances in terms of addressee-hood relative to a simple classification algorithm that assumes that "the addressee is where the eye is", and we compare our results to alternative approaches.
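The paper's classifiers are not reproduced here; as a rough illustration of the abstract's contrast, the following Python sketch compares the gaze-only baseline ("the addressee is where the eye is") with a hand-written rule that also consults dialogue history and utterance length. The feature encoding, threshold, and rule are illustrative assumptions, not the paper's actual method.

```python
# Minimal sketch (not the paper's implementation): contrasts a gaze-only
# baseline with a rule that also uses dialogue history and utterance length.
from dataclasses import dataclass

@dataclass
class Utterance:
    gaze_target: str      # where the speaker mostly looks: "computer" or "human"
    prev_addressee: str   # dialogue history: addressee of the previous utterance
    length_in_words: int  # utterance length

def baseline_addressee(u: Utterance) -> str:
    """Baseline: the addressee is where the eye is."""
    return u.gaze_target

def feature_based_addressee(u: Utterance) -> str:
    """Illustrative combination of the paper's feature types.

    Hypothetical rule: short utterances that follow a human-computer
    exchange often still address the computer, even if the speaker's
    gaze has already shifted toward a human partner.
    """
    if u.length_in_words <= 4 and u.prev_addressee == "computer":
        return "computer"
    return u.gaze_target

if __name__ == "__main__":
    u = Utterance(gaze_target="human", prev_addressee="computer", length_in_words=2)
    print(baseline_addressee(u))       # human
    print(feature_based_addressee(u))  # computer
```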
Original language | English |
---|---|
Title of host publication | Proceedings of the 7th international conference on Multimodal interfaces, October 4-6, 2005, Trento, Italy |
Place of Publication | New York, USA |
Publisher | Association for Computing Machinery, Inc |
Pages | 175-182 |
ISBN (Print) | 1-59593-028-0 |
Publication status | Published - 2005 |
Event | 7th International Conference on Multimodal Interfaces, ICMI 2005, Trento, Italy; 4 Oct 2005 → 6 Oct 2005; conference number: 7 |
Conference
Conference | 7th International Conference on Multimodal Interfaces, ICMI 2005 |
---|---|
Country/Territory | Italy |
City | Trento |
Period | 4/10/05 → 6/10/05 |
Other | ICMI ’05, International conference on multimodal interfaces |