Dynamic Data Pruning for Automatic Speech Recognition

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, Academic, peer-reviewed

Abstract

The recent success of Automatic Speech Recognition (ASR) is largely attributed to the ever-growing amount of training data. However, this trend has made model training prohibitively costly and computationally demanding. While data pruning has been proposed to mitigate this issue by identifying a small subset of relevant data, its application in ASR has barely been explored, and existing works often entail significant overhead to achieve meaningful results. To fill this gap, this paper presents the first investigation of dynamic data pruning for ASR, finding that full-data performance can be reached by dynamically selecting 70% of the data. Furthermore, we introduce Dynamic Data Pruning for ASR (DDP-ASR), which offers several fine-grained pruning granularities specifically tailored for speech-related datasets, going beyond the conventional pruning of entire time sequences. Our intensive experiments show that DDP-ASR can save up to 1.6x training time with negligible performance loss.
Original language: English
Title: Interspeech 2024
Subtitle: Kos, Greece, 1-5 September 2024
Publisher: ISCA
Pages: 4488-4492
Number of pages: 5
DOIs
Status: Published - 2024
Event: 25th Interspeech Conference, Interspeech 2024 - Kos, Greece
Duration: 1 Sep 2024 to 5 Sep 2024

Conference

Conference: 25th Interspeech Conference, Interspeech 2024
Abbreviated title: Interspeech 2024
Country/Territory: Greece
City: Kos
Period: 1/09/24 to 5/09/24

Funding

This research is part of the research program 'MegaMind-Measuring, Gathering, Mining and Integrating Data for Self-management in the Edge of the Electricity System' (partly) financed by the Dutch Research Council (NWO) through the Perspectief program under number P19-25. Additionally, note that only non-Meta authors utilized and processed the datasets (and no dataset pre-processing or processing took place on Meta's servers or facilities). Shiwei Liu is supported by the Royal Society with the Newton International Fellowship.
