Validation and comparison of a single and multi-institutional automated segmentation algorithm for vestibular schwannoma from contrast-enhanced T1-weighted MRI scans

Stefan Cornelissen, P.P.J.H. Langenhuizen, Sammy M. Schouten, Jonathan Shapey, Aaron Kujawa, Reuben Dorent, Tom Vercauteren, Henricus P.M. Kunst, H.B. Verheul, Peter H.N. de With

Research output: Contribution to journal › Meeting Abstract › Academic

Abstract

Introduction
In an earlier study, King's College London (KCL) proposed a framework for the automatic segmentation of vestibular schwannomas (VS) from MRI scans and demonstrated that artificial intelligence is capable of annotating VS and calculating tumor volumes, which can benefit the monitoring of tumor progression. In this study, the KCL algorithm is validated on data from another institution, and the results are compared to those of a multi-institutional model trained on the combined data from both centers.
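The abstract does not spell out how a tumor volume follows from a segmentation; a minimal sketch, assuming a binary voxel mask and known voxel spacing (both the function name and the example values are illustrative, not from the study):

```python
import numpy as np

def tumor_volume_ml(mask: np.ndarray, voxel_spacing_mm: tuple) -> float:
    """Volume of a binary segmentation mask in millilitres.

    mask: 3D array where nonzero voxels belong to the tumor.
    voxel_spacing_mm: physical voxel size (x, y, z) in millimetres.
    """
    voxel_volume_mm3 = float(np.prod(voxel_spacing_mm))
    n_voxels = int(np.count_nonzero(mask))
    return n_voxels * voxel_volume_mm3 / 1000.0  # 1 ml = 1000 mm^3

# Example: a 10x10x10 block of tumor voxels at 1x1x1 mm spacing is 1 ml.
mask = np.zeros((64, 64, 64), dtype=np.uint8)
mask[20:30, 20:30, 20:30] = 1
print(tumor_volume_ml(mask, (1.0, 1.0, 1.0)))  # 1.0
```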

Methods
In addition to the original KCL dataset of 375 contrast-enhanced T1-weighted (ceT1) MRI scans, a total of 1,115 ceT1 scans of individual VS patients from the ETZ hospital in Tilburg, the Netherlands, were used. All tumors were manually annotated by a neurosurgeon at the time of treatment. Employing the original 2.5D convolutional neural network, two different models were subsequently trained. The first model was trained solely on 176 scans from the KCL dataset, with an additional 20 and 46 scans used for hyperparameter tuning and internal validation, respectively; this single-institution model was externally validated on the entire ETZ dataset. The second model employed 242 scans from the KCL dataset and all ETZ scans acquired before 2015 (733 scans), with the remainder of the ETZ data distributed equally between a tuning and a validation set. The remaining 133 scans from the KCL dataset were also used for validation.
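The abstract names the original 2.5D convolutional neural network without architectural detail. One common 2.5D formulation, sketched here purely as an assumption about what "2.5D" means in practice, feeds each axial slice together with its neighboring slices as input channels to an otherwise 2D network:

```python
import numpy as np

def make_25d_input(volume: np.ndarray, slice_idx: int, context: int = 1) -> np.ndarray:
    """Stack a slice with its neighbors as channels for a 2D network.

    volume: 3D scan, shape (num_slices, H, W).
    Returns shape (2*context + 1, H, W); edge slices are clamped.
    """
    indices = np.clip(
        np.arange(slice_idx - context, slice_idx + context + 1),
        0, volume.shape[0] - 1,
    )
    return volume[indices]

# Example: 3-channel input (previous, current, next slice) for slice 0.
scan = np.random.rand(40, 256, 256).astype(np.float32)
x = make_25d_input(scan, slice_idx=0, context=1)
print(x.shape)  # (3, 256, 256)
```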

Results
The single-institution model achieved internal and external validation mean Dice scores of 92.0±5.1% and 64.5±32.4%, respectively. The external validation set included 175 scans on which the model failed to recognize any VS, resulting in a Dice score of zero for these scans and skewing the results. The second model yielded validation mean Dice scores of 92.5±5.1% on the ETZ scans and 89.1±9.6% on the KCL scans. Most of the low-scoring segmentations were of tumors that are either very small or cystic.
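The Dice score reported throughout is the standard overlap metric, Dice = 2|A∩B| / (|A| + |B|). A minimal sketch, including the convention (consistent with the failed external cases above) that a prediction with no overlap scores zero:

```python
import numpy as np

def dice_score(pred: np.ndarray, target: np.ndarray) -> float:
    """Dice similarity coefficient between two binary masks, in [0, 1].

    A prediction with no overlap (e.g. the model finds no tumor at all
    while one is present) scores 0.
    """
    pred = pred.astype(bool)
    target = target.astype(bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return float(2.0 * np.logical_and(pred, target).sum() / denom)
```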

Conclusions
The significant difference in performance between internal and external validation of the first model, which suggests poor generalization, can be explained by variations in spatial resolution and image acquisition parameters between the two datasets. The second model shows performance approaching the inter-observer variability of human annotators (cf. 93.8±3.1%). These results show that creating a robust, well-generalizing model from single-institution data is challenging. However, the multi-institutional results reinforce the previously demonstrated capabilities of the framework for the automatic segmentation of VS.
Original language: English
Pages (from-to): 60
Journal: Journal of Radiosurgery & SBRT
Volume: 8
Issue number: suppl. 1
Publication status: Published - 22 Sept 2022
Event: 15th International Stereotactic Radiosurgery Society Congress, ISRS 2022: Focal is Better - Milano Convention Centre, Milan, Italy
Duration: 19 Jun 2022 - 23 Jun 2022
Conference number: 15
https://isrscongress.org/

Keywords

  • vestibular schwannoma
  • deep learning
  • segmentation
