TY - GEN
T1 - Risk of Training Diagnostic Algorithms on Data with Demographic Bias
AU - Abbasi-Sureshjani, Samaneh
AU - Raumanns, Ralf
AU - Michels, Britt E.J.
AU - Schouten, Gerard
AU - Cheplygina, Veronika
PY - 2020
Y1 - 2020
N2 - One of the critical challenges in machine learning applications is to have fair predictions. There are numerous recent examples in various domains that convincingly show that algorithms trained with biased datasets can easily lead to erroneous or discriminatory conclusions. This is even more crucial in clinical applications, where predictive algorithms are designed mainly based on a given set of medical images, and demographic variables such as age, sex and race are not taken into account. In this work, we conduct a survey of the MICCAI 2018 proceedings to investigate the common practice in medical image analysis applications. Surprisingly, we find that papers focusing on diagnosis rarely describe the demographics of the datasets used, and the diagnosis is based purely on images. In order to highlight the importance of considering demographics in diagnosis tasks, we use a publicly available dataset of skin lesions. We then demonstrate that a classifier with an overall area under the curve (AUC) of 0.83 has variable performance between 0.76 and 0.91 on subgroups based on age and sex, even though the training set was relatively balanced. Moreover, we show that it is possible to learn unbiased features by explicitly using demographic variables in an adversarial training setup, which leads to balanced scores per subgroup. Finally, we discuss the implications of these results and provide recommendations for further research.
KW - Classification parity
KW - Computer-aided diagnosis
KW - Demographic bias
UR - https://www.scopus.com/pages/publications/85092940619
U2 - 10.1007/978-3-030-61166-8_20
DO - 10.1007/978-3-030-61166-8_20
M3 - Conference contribution
AN - SCOPUS:85092940619
SN - 9783030611651
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 183
EP - 192
BT - Interpretable and Annotation-Efficient Learning for Medical Image Computing - 3rd International Workshop, iMIMIC 2020, 2nd International Workshop, MIL3iD 2020, and 5th International Workshop, LABELS 2020, Held in Conjunction with MICCAI 2020, Proceedings
A2 - Cardoso, Jaime
A2 - Silva, Wilson
A2 - Cruz, Ricardo
A2 - Van Nguyen, Hien
A2 - Roysam, Badri
A2 - Heller, Nicholas
A2 - Henriques Abreu, Pedro
A2 - Pereira Amorim, Jose
A2 - Isgum, Ivana
A2 - Patel, Vishal
A2 - Zhou, Kevin
A2 - Jiang, Steve
A2 - Le, Ngan
A2 - Luu, Khoa
A2 - Sznitman, Raphael
A2 - Cheplygina, Veronika
A2 - Abbasi, Samaneh
A2 - Mateus, Diana
A2 - Trucco, Emanuele
PB - Springer
T2 - LABELS 2020
Y2 - 4 October 2020 through 8 October 2020
ER -