TY - JOUR
T1 - Deep learning-based grading of ductal carcinoma in situ in breast histopathology images
AU - Wetstein, Suzanne C.
AU - Stathonikos, Nikolas
AU - Pluim, Josien P.W.
AU - Heng, Yujing J.
AU - ter Hoeve, Natalie D.
AU - Vreuls, Celien P.H.
AU - van Diest, Paul J.
AU - Veta, Mitko
N1 - Funding Information:
This work was supported by the Deep Learning for Medical Image Analysis research program by The Dutch Research Council P15–26 and Philips Research (SCW, MV and JPWP).
PY - 2021/4
Y1 - 2021/4
N2 - Ductal carcinoma in situ (DCIS) is a non-invasive breast cancer that can progress into invasive ductal carcinoma (IDC). Studies suggest DCIS is often overtreated since a considerable part of DCIS lesions may never progress into IDC. Lower grade lesions have a lower progression speed and risk, possibly allowing treatment de-escalation. However, studies show significant inter-observer variation in DCIS grading. Automated image analysis may provide an objective solution to address high subjectivity of DCIS grading by pathologists. In this study, we developed and evaluated a deep learning-based DCIS grading system. The system was developed using the consensus DCIS grade of three expert observers on a dataset of 1186 DCIS lesions from 59 patients. The inter-observer agreement, measured by quadratic weighted Cohen’s kappa, was used to evaluate the system and compare its performance to that of expert observers. We present an analysis of the lesion-level and patient-level inter-observer agreement on an independent test set of 1001 lesions from 50 patients. The deep learning system (dl) achieved on average slightly higher inter-observer agreement to the three observers (o1, o2 and o3) (κ_{o1,dl} = 0.81, κ_{o2,dl} = 0.53 and κ_{o3,dl} = 0.40) than the observers amongst each other (κ_{o1,o2} = 0.58, κ_{o1,o3} = 0.50 and κ_{o2,o3} = 0.42) at the lesion-level. At the patient-level, the deep learning system achieved similar agreement to the observers (κ_{o1,dl} = 0.77, κ_{o2,dl} = 0.75 and κ_{o3,dl} = 0.70) as the observers amongst each other (κ_{o1,o2} = 0.77, κ_{o1,o3} = 0.75 and κ_{o2,o3} = 0.72). The deep learning system better reflected the grading spectrum of DCIS than two of the observers. In conclusion, we developed a deep learning-based DCIS grading system that achieved a performance similar to expert observers. To the best of our knowledge, this is the first automated system for the grading of DCIS that could assist pathologists by providing robust and reproducible second opinions on DCIS grade.
UR - http://www.scopus.com/inward/record.url?scp=85101283524&partnerID=8YFLogxK
U2 - 10.1038/s41374-021-00540-6
DO - 10.1038/s41374-021-00540-6
M3 - Article
C2 - 33608619
AN - SCOPUS:85101283524
SN - 0023-6837
VL - 101
SP - 525
EP - 533
JO - Laboratory Investigation
JF - Laboratory Investigation
IS - 4
ER -