A binaural scene analyzer for joint localization and recognition of speakers in the presence of interfering noise sources and reverberation

T. May, S.L.J.D.E. van de Par, A.G. Kohlrausch

Research output: Contribution to journal › Article › Academic › peer-review

67 Citations (Scopus)
2 Downloads (Pure)

Abstract

In this study, we present a binaural scene analyzer that is able to simultaneously localize, detect and identify a known number of target speakers in the presence of spatially positioned noise sources and reverberation. In contrast to many other binaural cocktail party processors, the proposed system does not require a priori knowledge about the azimuth position of the target speakers. The proposed system consists of three main building blocks: binaural localization, speech source detection, and automatic speaker identification. First, a binaural front-end is used to robustly localize relevant sound source activity. Second, a speech detection module based on missing data classification is employed to determine whether detected sound source activity corresponds to a speaker or to an interfering noise source using a binary mask that is based on spatial evidence supplied by the binaural front-end. Third, a second missing data classifier is used to recognize the speaker identities of all detected speech sources. The proposed system is systematically evaluated in simulated adverse acoustic scenarios. Compared to state-of-the-art MFCC recognizers, the proposed model achieves significant speaker recognition accuracy improvements.
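The missing data classification mentioned in the abstract can be illustrated with a minimal sketch. It assumes diagonal-covariance Gaussian mixture models and plain marginalization, where feature dimensions flagged as unreliable by the binary mask are simply left out of the likelihood. The abstract does not specify the classifier's details, so the function names, array shapes, and the marginalization rule here are illustrative assumptions and not the authors' implementation.

```python
import numpy as np


def masked_gmm_loglik(features, mask, weights, means, variances):
    """Total log-likelihood of a sequence under a diagonal-covariance GMM,
    using only the feature dimensions the binary mask marks as reliable.

    Hypothetical shapes (not taken from the paper):
    features (T, D), mask (T, D) bool, weights (K,), means (K, D), variances (K, D).
    """
    diff = features[:, None, :] - means[None, :, :]              # (T, K, D)
    log_pdf = -0.5 * (np.log(2.0 * np.pi * variances)[None]
                      + diff ** 2 / variances[None])             # (T, K, D)
    # With diagonal covariances, marginalizing an unreliable dimension
    # amounts to dropping its term from the per-component sum.
    log_pdf = np.where(mask[:, None, :], log_pdf, 0.0)
    comp = np.log(weights)[None, :] + log_pdf.sum(axis=2)        # (T, K)
    # Numerically stable log-sum-exp over mixture components.
    peak = comp.max(axis=1, keepdims=True)
    frame_ll = peak[:, 0] + np.log(np.exp(comp - peak).sum(axis=1))
    return float(frame_ll.sum())


def identify(features, mask, models):
    """Return the key of the model with the highest masked log-likelihood.

    models maps a label to a (weights, means, variances) tuple.
    """
    scores = {name: masked_gmm_loglik(features, mask, *params)
              for name, params in models.items()}
    return max(scores, key=scores.get)
```

In this sketch the same scoring function could serve both classification stages: one model set separating speech from noise, and a second set holding one model per known speaker. The mask itself would come from the spatial evidence of the binaural front-end, which is not modeled here.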
Original language: English
Pages (from-to): 2016-2030
Number of pages: 15
Journal: IEEE Transactions on Audio, Speech, and Language Processing
Volume: 20
Issue number: 7
Publication status: Published - 2012
