Detection of frame informativeness in endoscopic videos using image quality and recurrent neural networks

T.G.W. Boers, J. van der Putten, J. de Groof, M. Struyvenberg, K. Fockens, W. Curvers, E. Schoon, F. van der Sommen, J. Bergman, P. H.N. de With

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

3 Citations (Scopus)

Abstract

Gastroenterologists are estimated to misdiagnose up to 25% of esophageal adenocarcinomas in Barrett's Esophagus patients. This prompts the need for more sensitive and objective tools to aid clinicians with lesion detection. Artificial Intelligence (AI) can make examinations more objective and can therefore help to mitigate observer dependency. Since such models are trained on good-quality endoscopic video frames to attain high efficacy, high-quality images are also needed at inference time. Therefore, we aim to develop a framework that identifies frames of good image quality through a-priori informativeness classification, which leads to more robust inference. We show that we can maintain informativeness over the temporal domain using recurrent neural networks, yielding a higher performance on non-informativeness detection than classifying individual images. Furthermore, we find that Gradient-weighted Class Activation Mapping (Grad-CAM) allows us to better localize informativeness within a frame. We have developed a customized ResNet18 feature extractor with three classifiers: a Fully-Connected (FC), a Long Short-Term Memory (LSTM) and a Gated Recurrent Unit (GRU) classifier. Experimental results are based on 4,349 frames from 20 pullback videos of the esophagus. Our results demonstrate that the algorithm achieves performance comparable to the current state-of-the-art. The FC and LSTM classifiers both reach an F1 score of 91%. We found that the Grad-CAMs based on the LSTM classifier best represent the origin of non-informativeness, as 85% of the images were found to highlight the correct area. The benefit of our novel implementation for endoscopic informativeness classification is that it is trained end-to-end, incorporates the spatio-temporal domain in the decision making for robustness, and makes the model's decisions insightful through the use of Grad-CAMs.
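For readers unfamiliar with the F1 score quoted in the abstract, the following is a minimal sketch of how it is computed from per-frame labels. It is not the authors' evaluation code. The helper name `f1_score`, the toy label lists, and the choice of "non-informative" as the positive class are assumptions made purely for illustration; the numbers are not taken from the paper.

```python
def f1_score(y_true, y_pred, positive=1):
    """Return the F1 score (harmonic mean of precision and recall)."""
    pairs = list(zip(y_true, y_pred))
    # Count true positives, false positives and false negatives.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No correct positive predictions: define F1 as 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical per-frame labels (assumption: 1 = non-informative, 0 = informative).
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 1, 0, 0]

score = f1_score(y_true, y_pred)
print(f"F1 = {score:.2f}")
```

In this toy case there are three true positives, one false positive and one false negative, so precision and recall are both 0.75 and the F1 score is 0.75 as well.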

Original language: English
Title: Medical Imaging 2020
Subtitle: Image Processing
Editors: Ivana Isgum, Bennett A. Landman
Publisher: SPIE
Number of pages: 6
ISBN (electronic): 9781510633933
Status: Published - 2020
Event: SPIE Medical Imaging 2020 - Houston, United States
Duration: 15 Feb 2020 to 20 Feb 2020

Publication series

Name: Proceedings of SPIE
Volume: 11313
ISSN (print): 1605-7422

Conference

Conference: SPIE Medical Imaging 2020
Country/Region: United States
City: Houston
Period: 15/02/20 to 20/02/20

Bibliographical note

Publisher Copyright:
© 2020 SPIE. All rights reserved.

Copyright:
Copyright 2020 Elsevier B.V., All rights reserved.
