Risk of Training Diagnostic Algorithms on Data with Demographic Bias

Samaneh Abbasi-Sureshjani, Ralf Raumanns, Britt E.J. Michels, Gerard Schouten, Veronika Cheplygina

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > Academic > peer-review

10 Citations (Scopus)


One of the critical challenges in machine learning applications is making fair predictions. Numerous recent examples in various domains convincingly show that algorithms trained on biased datasets can easily lead to erroneous or discriminatory conclusions. This is even more crucial in clinical applications, where predictive algorithms are designed mainly on the basis of a given set of medical images, and demographic variables such as age, sex and race are not taken into account. In this work, we survey the MICCAI 2018 proceedings to investigate common practice in medical image analysis applications. Surprisingly, we find that papers focusing on diagnosis rarely describe the demographics of the datasets used, and that the diagnosis is based purely on images. To highlight the importance of considering demographics in diagnosis tasks, we use a publicly available dataset of skin lesions. We demonstrate that a classifier with an overall area under the curve (AUC) of 0.83 has performance varying between 0.76 and 0.91 on subgroups based on age and sex, even though the training set was relatively balanced. Moreover, we show that it is possible to learn unbiased features by explicitly using demographic variables in an adversarial training setup, which leads to balanced scores per subgroup. Finally, we discuss the implications of these results and provide recommendations for further research.
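The subgroup comparison described in the abstract can be illustrated with a minimal sketch that computes AUC overall and per demographic subgroup. This is not the authors' code: the function names `auc` and `auc_by_subgroup`, the group labels, and the toy numbers are all assumptions made for illustration, and AUC is computed here with the pairwise (Mann-Whitney) formulation rather than any particular library.

```python
from collections import defaultdict


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Returns None when one class is
    missing, because AUC is undefined in that case.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auc_by_subgroup(labels, scores, groups):
    """Split the samples by subgroup label and compute AUC within each."""
    buckets = defaultdict(lambda: ([], []))
    for y, s, g in zip(labels, scores, groups):
        buckets[g][0].append(y)
        buckets[g][1].append(s)
    return {g: auc(ys, ss) for g, (ys, ss) in buckets.items()}


# Toy data (made up, NOT taken from the paper): two hypothetical subgroups.
labels = [1, 0, 1, 0, 1, 0, 1, 0]
scores = [0.9, 0.1, 0.8, 0.2, 0.6, 0.7, 0.4, 0.3]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

overall = auc(labels, scores)
per_group = auc_by_subgroup(labels, scores, groups)
print("overall AUC:", overall)
print("per-subgroup AUC:", per_group)
```

In this toy example the overall score hides a large gap between the two subgroups, which is the kind of effect the abstract reports for age and sex subgroups. In practice the subgroup labels would come from the dataset's demographic metadata, and small subgroups give noisy AUC estimates.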

Original language: English
Title of host publication: Interpretable and Annotation-Efficient Learning for Medical Image Computing - 3rd International Workshop, iMIMIC 2020, 2nd International Workshop, MIL3iD 2020, and 5th International Workshop, LABELS 2020, Held in Conjunction with MICCAI 2020, Proceedings
Editors: Jaime Cardoso, Wilson Silva, Ricardo Cruz, Hien Van Nguyen, Badri Roysam, Nicholas Heller, Pedro Henriques Abreu, Jose Pereira Amorim, Ivana Isgum, Vishal Patel, Kevin Zhou, Steve Jiang, Ngan Le, Khoa Luu, Raphael Sznitman, Veronika Cheplygina, Samaneh Abbasi, Diana Mateus, Emanuele Trucco
Number of pages: 10
ISBN (Print): 9783030611651
Publication status: Published - 2020
Event: LABELS 2020 - Lima, Peru
Duration: 4 Oct 2020 to 8 Oct 2020

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12446 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: LABELS 2020
Other: held in conjunction with the 23rd International Conference on Medical Image Computing and Computer Assisted Intervention, MICCAI 2020


  • Classification parity
  • Computer-aided diagnosis
  • Demographic bias

