This study investigated the irrelevant speech effect (ISE) using a serial-recall task performed under six conditions: silence, speech only, noise only, speech masked by stationary noise at two signal-to-noise ratios (SNRs), and speech masked by adaptive noise. Measured in five test blocks distributed over the four test days, the serial-recall error rate in the silence condition decreased sharply in the first few test blocks and had halved after about seven blocks were completed. With the adaptive masking scheme, the serial-recall error rate was lower than in the speech-only condition (by 9%) and in the lower-SNR stationary-noise condition (by 4.4%). However, serial-recall performance did not differ significantly between the stationary and adaptive maskers when the average sound level was carefully matched. The Speech Transmission Index (STI) and the correlation coefficient of power spectra were used as estimators of the temporal and spectral distinctiveness between sound tokens, respectively. Comparison with the test results implied that the frequency-domain estimator may be a better predictor of the relative ISE, especially for a non-stationary masker, although the results also suggested that such estimators may need to be combined, possibly with appropriate weighting.
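
The spectral estimator named above, the correlation coefficient of power spectra between two sound tokens, can be illustrated with a minimal sketch. The abstract does not say how the power spectra were computed, so the choices here are assumptions made only for illustration: a single Hann-windowed FFT over each whole token, a linear power scale, and equal-length tokens. The function names `power_spectrum` and `spectral_correlation` are hypothetical and are not taken from the study.

```python
import numpy as np


def power_spectrum(token):
    """Power spectrum of one token from a single Hann-windowed FFT.

    Assumption: the whole token is treated as one analysis frame.
    """
    x = np.asarray(token, dtype=float)
    window = np.hanning(len(x))
    spectrum = np.fft.rfft(x * window)
    return np.abs(spectrum) ** 2


def spectral_correlation(token_a, token_b):
    """Pearson correlation coefficient between two tokens' power spectra.

    Values near 1 mean spectrally similar tokens (low distinctiveness);
    values near 0 mean spectrally distinct tokens.
    """
    a = np.asarray(token_a, dtype=float)
    b = np.asarray(token_b, dtype=float)
    if a.shape != b.shape:
        # Assumption: tokens are trimmed or padded to equal length beforehand.
        raise ValueError("tokens must have the same length")
    return float(np.corrcoef(power_spectrum(a), power_spectrum(b))[0, 1])


# Synthetic stand-ins for sound tokens: two pure tones at different frequencies.
fs = 8000                      # sampling rate in Hz (illustrative value)
t = np.arange(1024) / fs
tone_low = np.sin(2 * np.pi * 500 * t)
tone_high = np.sin(2 * np.pi * 2000 * t)

r_same = spectral_correlation(tone_low, tone_low)
r_diff = spectral_correlation(tone_low, tone_high)
```

In this toy case, `r_same` is close to 1 because a token is compared with itself, while `r_diff` is close to 0 because the two tones occupy disjoint frequency bins. Real speech tokens would fall between these extremes, and a masker that flattens the spectral differences between successive tokens would push the coefficient toward 1.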