TY - GEN
T1 - Exploring the Effect of Dataset Diversity in Self-supervised Learning for Surgical Computer Vision
AU - Jaspers, Tim J.M.
AU - de Jong, Ronald L.P.D.
AU - Al Khalil, Yasmina
AU - Zeelenberg, Tijn
AU - Kusters, Koen
AU - Li, Yiping
AU - van Jaarsveld, Romy C.
AU - Bakker, Franciscus H.A.
AU - Ruurda, Jelle P.
AU - Brinkman, Willem M.
AU - de With, Peter H.N.
AU - van der Sommen, Fons
PY - 2024/10/25
Y1 - 2024/10/25
AB - Over the past decade, computer vision applications in minimally invasive surgery have rapidly increased. Despite this growth, the impact of surgical computer vision remains limited compared to other medical fields like pathology and radiology, primarily due to the scarcity of representative annotated data. Whereas transfer learning from large annotated datasets such as ImageNet has been conventionally the norm to achieve high-performing models, recent advancements in self-supervised learning (SSL) have demonstrated superior performance. In medical image analysis, in-domain SSL pretraining has already been shown to outperform ImageNet-based initialization. Although unlabeled data in the field of surgical computer vision is abundant, the diversity within this data is limited. This study investigates the role of dataset diversity in SSL for surgical computer vision, comparing procedure-specific datasets against a more heterogeneous general surgical dataset across three different downstream surgical applications. The obtained results show that using solely procedure-specific data can lead to substantial improvements of 13.8%, 9.5%, and 36.8% compared to ImageNet pretraining. However, extending this data with more heterogeneous surgical data further increases performance by an additional 5.0%, 5.2%, and 2.5%, suggesting that increasing diversity within SSL data is beneficial for model performance. The code and pretrained model weights are made publicly available at https://github.com/TimJaspers0801/SurgeNet.
U2 - 10.1007/978-3-031-73748-0_5
DO - 10.1007/978-3-031-73748-0_5
M3 - Conference contribution
SN - 978-3-031-73747-3
T3 - Lecture Notes in Computer Science (LNCS)
SP - 43
EP - 53
BT - Data Engineering in Medical Imaging
A2 - Bhattarai, Binod
A2 - Ali, Sharib
A2 - Rau, Anita
A2 - Caramalau, Razvan
A2 - Nguyen, Anh
A2 - Gyawali, Prashnna
A2 - Namburete, Ana
A2 - Stoyanov, Daniel
PB - Springer
CY - Cham
T2 - 2nd MICCAI Workshop on Data Engineering in Medical Imaging, DEMI 2024
Y2 - 10 October 2024
ER -