Target Speaker Selection for Neural Network Beamforming in Multi-Speaker Scenarios

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, Academic, peer-review

Abstract

We propose a speaker selection mechanism (SSM) to enhance the training of a beamforming neural network. Our approach is motivated by the observation that listeners typically orient themselves toward the target speaker at a slight undershot angle. The mechanism enables the neural network to learn which speaker to focus on in multi-speaker scenarios, based on the relative positions of the listener and speakers. Importantly, only audio input is required during inference. We conduct acoustic simulations to evaluate the effectiveness of the SSM, demonstrating its impact on performance. Results show significant improvements in speech intelligibility, quality, and distortion metrics, outperforming both the ideal minimum variance distortionless filter and the same neural network model trained without SSM.
Original language: English
Title of host publication: European Signal Processing Conference, EUSIPCO
Publisher: Institute of Electrical and Electronics Engineers
Pages: 166-170
Number of pages: 5
ISBN (Electronic): 978-9-4645-9362-4
Publication status: Published - 17 Nov 2025
Event: 33rd European Signal Processing Conference 2025, EUSIPCO - Palermo, Italy
Duration: 8 Sept 2025 to 12 Sept 2025

Conference

Conference: 33rd European Signal Processing Conference 2025, EUSIPCO
Abbreviated title: EUSIPCO
Country/Territory: Italy
City: Palermo
Period: 8/09/25 to 12/09/25
