TY - JOUR
T1 - How to measure uncertainty in uncertainty sampling for active learning
AU - Nguyen, Vu-Linh
AU - Shaker, Mohammad Hossein
AU - Hüllermeier, Eyke
PY - 2022/1
Y1 - 2022/1
N2 - Various strategies for active learning have been proposed in the machine learning literature. In uncertainty sampling, which is among the most popular approaches, the active learner sequentially queries the label of those instances for which its current prediction is maximally uncertain. The predictions as well as the measures used to quantify the degree of uncertainty, such as entropy, are traditionally of a probabilistic nature. Yet, alternative approaches to capturing uncertainty in machine learning, along with corresponding uncertainty measures, have been proposed in recent years. In particular, some of these measures seek to distinguish different sources and to separate different types of uncertainty, such as the reducible (epistemic) and the irreducible (aleatoric) part of the total uncertainty in a prediction. The goal of this paper is to elaborate on the usefulness of such measures for uncertainty sampling, and to compare their performance in active learning. To this end, we instantiate uncertainty sampling with different measures, analyze the properties of the sampling strategies thus obtained, and compare them in an experimental study.
AB - Various strategies for active learning have been proposed in the machine learning literature. In uncertainty sampling, which is among the most popular approaches, the active learner sequentially queries the label of those instances for which its current prediction is maximally uncertain. The predictions as well as the measures used to quantify the degree of uncertainty, such as entropy, are traditionally of a probabilistic nature. Yet, alternative approaches to capturing uncertainty in machine learning, along with corresponding uncertainty measures, have been proposed in recent years. In particular, some of these measures seek to distinguish different sources and to separate different types of uncertainty, such as the reducible (epistemic) and the irreducible (aleatoric) part of the total uncertainty in a prediction. The goal of this paper is to elaborate on the usefulness of such measures for uncertainty sampling, and to compare their performance in active learning. To this end, we instantiate uncertainty sampling with different measures, analyze the properties of the sampling strategies thus obtained, and compare them in an experimental study.
KW - Active learning
KW - Aleatoric uncertainty
KW - Credal uncertainty
KW - Epistemic uncertainty
KW - Uncertainty sampling
UR - http://www.scopus.com/inward/record.url?scp=85108240027&partnerID=8YFLogxK
U2 - 10.1007/s10994-021-06003-9
DO - 10.1007/s10994-021-06003-9
M3 - Article
SN - 0885-6125
VL - 111
SP - 89
EP - 122
JO - Machine Learning
JF - Machine Learning
ER -