Psychoacoustical evaluation of PSOLA. II. Double-formant stimuli and the role of vocal perturbation

R.W.L. Kortekaas, A.G. Kohlrausch

Research output: Contribution to journal > Article > Academic > peer-review

Abstract

This article presents the results of listening experiments and psychoacoustical modeling aimed at evaluating the pitch-synchronous overlap-and-add (PSOLA) technique. This technique can be used for simultaneous modification of the pitch and duration of natural speech, using simple and efficient time-domain operations on the speech waveform. The first set of experiments tested the ability of subjects to discriminate double-formant stimuli, modified in fundamental frequency using PSOLA, from unmodified stimuli. Of the potential auditory discrimination cues induced by PSOLA, cues from the first formant were found to generally dominate discrimination performance. In the second set of experiments, the influence of vocal perturbation, i.e., jitter and shimmer, on the discriminability of PSOLA-modified single-formant stimuli was determined. The data show that discriminability deteriorates at most modestly in the presence of jitter and shimmer. With the exception of a few conditions, the trends in these data could be replicated using either a modulation-discrimination or an intensity-discrimination model, depending on the formant frequency. As a baseline experiment, detection thresholds for jitter and shimmer were measured. Thresholds for jitter could be replicated using either the modulation-discrimination or the intensity-discrimination model, depending on the (mean) fundamental frequency of the stimuli. The thresholds for shimmer could be accurately predicted for stimuli with a 250-Hz fundamental, but less accurately in the case of a 100-Hz fundamental.
Original language: English
Pages (from-to): 522-535
Number of pages: 13
Journal: Journal of the Acoustical Society of America
Volume: 105
Issue number: 1
Publication status: Published - 1999
