TY - JOUR
T1 - From detection of individual metastases to classification of lymph node status at the patient level
T2 - the CAMELYON17 challenge
AU - Bandi, Peter
AU - Geessink, Oscar
AU - Manson, Quirine
AU - van Dijk, Marcory
AU - Balkenhol, Maschenka
AU - Hermsen, Meyke
AU - Bejnordi, Babak Ehteshami
AU - Lee, Byungjae
AU - Paeng, Kyunghyun
AU - Zhong, Aoxiao
AU - Li, Quanzheng
AU - Zanjani, Farhad Ghazvinian
AU - Zinger, Svitlana
AU - Fukuta, Keisuke
AU - Komura, Daisuke
AU - Ovtcharov, Vlado
AU - Cheng, Shenghua
AU - Zeng, Shaoqun
AU - Thagaard, Jeppe
AU - Dahl, Anders B.
AU - Lin, Huangjing
AU - Chen, Hao
AU - Jacobsson, Ludwig
AU - Hedlund, Martin
AU - Cetin, Melih
AU - Halici, Eren
AU - Jackson, Hunter
AU - Chen, Richard
AU - Both, Fabian
AU - Franke, Jorg
AU - Kusters-Vandevelde, Heidi
AU - Vreuls, Willem
AU - Bult, Peter
AU - van Ginneken, Bram
AU - van der Laak, Jeroen
AU - Litjens, Geert
PY - 2019
Y1 - 2019/02//
N2 - Automated detection of cancer metastases in lymph nodes has the potential to improve the assessment of prognosis for patients. To enable fair comparison between the algorithms for this purpose, we set up the CAMELYON17 challenge in conjunction with the IEEE International Symposium on Biomedical Imaging 2017 Conference in Melbourne. Over 300 participants registered on the challenge website, of which 23 teams submitted a total of 37 algorithms before the initial deadline. Participants were provided with 899 whole-slide images (WSIs) for developing their algorithms. The developed algorithms were evaluated based on the test set encompassing 100 patients and 500 WSIs. The evaluation metric used was a quadratic weighted Cohen's kappa. We discuss the algorithmic details of the 10 best pre-conference and two post-conference submissions. All these participants used convolutional neural networks in combination with pre- and postprocessing steps. Algorithms differed mostly in neural network architecture, training strategy, and pre- and postprocessing methodology. Overall, the kappa metric ranged from 0.89 to -0.13 across all submissions. The best results were obtained with pre-trained architectures such as ResNet. Confusion matrix analysis revealed that all participants struggled with reliably identifying isolated tumor cells, the smallest type of metastasis, with detection rates below 40%. Qualitative inspection of the results of the top participants showed categories of false positives, such as nerves or contamination, which could be targeted for further optimization. Last, we show that simple combinations of the top algorithms result in higher kappa metric values than any algorithm individually, with 0.93 for the best combination.
KW - Biomedical imaging
KW - breast cancer
KW - grand challenge
KW - Hospitals
KW - lymph node metastases
KW - Lymph nodes
KW - Metastasis
KW - Pathology
KW - sentinel lymph node
KW - Tumors
KW - whole-slide images
UR - http://www.scopus.com/inward/record.url?scp=85052677567&partnerID=8YFLogxK
U2 - 10.1109/TMI.2018.2867350
DO - 10.1109/TMI.2018.2867350
M3 - Article
C2 - 30716025
SN - 0278-0062
VL - 38
SP - 550
EP - 560
JO - IEEE Transactions on Medical Imaging
JF - IEEE Transactions on Medical Imaging
IS - 2
M1 - 8447230
ER -