This paper demonstrates the potential of theoretically motivated learning methods for non-intrusive quality estimation, for which the state of the art is represented by the ITU-T P.563 standard. To construct our estimator, we adopt the speech features from P.563 but use a different mapping from features to quality estimates. In contrast to P.563, which divides the feature space according to assumed distortion classes, our approach divides the feature space using a clustering learned from the data by Bayesian inference. Despite using weaker modeling assumptions, we achieve accuracy comparable to that of P.563 in predicting mean opinion scores. Our work suggests Bayesian model evidence as an alternative to the correlation coefficient as a metric for determining the number of experts needed to model the data.
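For orientation, the generic textbook form of a mixture-of-experts predictor and of evidence-based model selection can be written as below. This is only a sketch: the abstract does not give the paper's actual formulation, so every symbol here is an assumption introduced for illustration ($x$ for a feature vector, $K$ for the number of experts, $g_k$ for the gating weights, $f_k$ for the per-expert mapping, $\theta$ for the model parameters, $\mathcal{D}$ for the training data).

```latex
% Assumed notation, not taken from the paper.
% Quality estimate as a gated combination of K experts:
\hat{y}(x) = \sum_{k=1}^{K} g_k(x)\, f_k(x),
\qquad g_k(x) \ge 0, \quad \sum_{k=1}^{K} g_k(x) = 1

% Bayesian model evidence for a model with K experts:
p(\mathcal{D} \mid K) = \int p(\mathcal{D} \mid \theta, K)\, p(\theta \mid K)\, d\theta

% Number of experts chosen by maximizing the evidence:
K^{\ast} = \arg\max_{K}\; p(\mathcal{D} \mid K)
```

Because the evidence integrates over the parameters rather than fitting them, adding experts that the data do not support tends to lower it, which is what makes it usable as a selection criterion alongside the correlation coefficient.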
|Title of host publication||Second International Workshop on Quality of Multimedia Experience, IEEE Signal Processing Society, 2010, 21-23 June, Trondheim|
|Publication status||Published - 2010|