CLIP-DSA: Textual Knowledge-Guided Cerebrovascular Diseases Recognition in Multi-view Digital Subtraction Angiography

  • Qihang Xie
  • Dan Zhang (corresponding author)
  • Mengting Liu
  • Jianwei Zhang
  • Ruisheng Su
  • Caifeng Shan
  • Jiong Zhang (corresponding author)

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

Abstract

Digital Subtraction Angiography (DSA) sequences are the gold standard for diagnosing most cerebrovascular diseases (CVDs). Rapid and accurate recognition of CVDs in DSA sequences helps clinicians make the right decisions in clinical practice. However, the pathological characteristics of CVDs are numerous and complex, and the spatiotemporal complexity of DSA sequences is high, which makes diagnosing CVDs challenging. In this paper, we therefore propose CLIP-DSA, a novel CVD classification framework based on CLIP, a pre-trained vision-language model. We aim to use textual knowledge to guide the robust classification of common CVDs in multi-view DSA sequences. Specifically, CLIP-DSA comprises a dual-branch vision encoder and a text encoder. The vision encoder extracts features from the multi-view sequences, while the text encoder provides textual knowledge. To make better use of the temporal information in DSA sequences, we introduce a temporal pooling module that dynamically compresses image features along the time dimension. Additionally, we design a multi-view contrastive loss that enhances the network's image-text representation ability by constraining the image features between the two views. On a large dataset of 2,026 patients, the proposed CLIP-DSA achieved an AUC of 90.8% in CVD classification. The code is available at https://github.com/jiongzhang-john/CLIP-DSA.
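
The abstract names two components, a temporal pooling module and a multi-view contrastive loss, without giving their formulas. The following is only a minimal NumPy sketch of one common way such components are built, not the authors' implementation: it assumes attention-style weighting over frames for the pooling and a symmetric InfoNCE objective for the contrastive terms. The names `temporal_pool`, `info_nce`, `w`, `view_a`, `view_b`, the temperature of 0.07, and all tensor shapes are hypothetical choices made for illustration.

```python
import numpy as np


def log_softmax(x, axis=-1):
    # Numerically stable log-softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))


def temporal_pool(frames, w):
    """Compress per-frame features (T, D) into one vector (D,).

    Assumption: each frame is scored by a dot product with a vector `w`
    (learned in a real model) and the frames are averaged with
    softmax weights over time.
    """
    scores = frames @ w                    # (T,)
    weights = np.exp(log_softmax(scores))  # (T,), sums to 1
    return weights @ frames                # (D,)


def l2_normalize(x, eps=1e-8):
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def info_nce(a, b, temperature=0.07):
    """Symmetric InfoNCE between two batches (B, D).

    Row i of `a` and row i of `b` form the positive pair; all other
    rows in the batch act as negatives.
    """
    a, b = l2_normalize(a), l2_normalize(b)
    logits = a @ b.T / temperature         # (B, B) scaled cosine similarities
    idx = np.arange(len(a))
    a_to_b = log_softmax(logits, axis=1)[idx, idx]
    b_to_a = log_softmax(logits, axis=0)[idx, idx]
    return -0.5 * (a_to_b.mean() + b_to_a.mean())


# Toy usage with random stand-ins for encoder outputs.
rng = np.random.default_rng(0)
B, T, D = 4, 16, 32                        # patients, frames, feature size
w = rng.normal(size=D)                     # hypothetical scoring vector
view_a = rng.normal(size=(B, T, D))        # frame features, first view
view_b = rng.normal(size=(B, T, D))        # frame features, second view
text = rng.normal(size=(B, D))             # text-encoder features

pooled_a = np.stack([temporal_pool(x, w) for x in view_a])
pooled_b = np.stack([temporal_pool(x, w) for x in view_b])

# Image-text terms for each view plus a term tying the two views together.
total_loss = (info_nce(pooled_a, text)
              + info_nce(pooled_b, text)
              + info_nce(pooled_a, pooled_b))
```

In this sketch the third term is what ties the two views of one patient together; how the published method weights or formulates that constraint is not stated in the abstract, so the linked repository remains the authoritative source.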

Original language: English
Title of host publication: Medical Image Computing and Computer Assisted Intervention, MICCAI 2025
Subtitle of host publication: 28th International Conference, Daejeon, South Korea, September 23–27, 2025, Proceedings
Editors: James C. Gee, Daniel C. Alexander, Jaesung Hong, Juan Eugenio Iglesias, Carole H. Sudre, Archana Venkataraman, Polina Golland, Jong Hyo Kim, Jinah Park
Place of Publication: Cham
Publisher: Springer
Pages: 68-77
Number of pages: 10
Volume: VI
ISBN (Electronic): 978-3-032-04978-0
ISBN (Print): 978-3-032-04977-3
Publication status: Published - 19 Sept 2025
Event: 28th International Conference on Medical Image Computing and Computer Assisted Intervention, MICCAI 2025 - Daejeon, Korea, Republic of
Duration: 23 Sept 2025 – 27 Sept 2025

Publication series

Name: Lecture Notes in Computer Science (LNCS)
Volume: 15965
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 28th International Conference on Medical Image Computing and Computer Assisted Intervention, MICCAI 2025
Country/Territory: Korea, Republic of
City: Daejeon
Period: 23/09/25 – 27/09/25

Bibliographical note

Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2026.

Keywords

  • Cerebrovascular Diseases
  • Digital Subtraction Angiography
  • Image-Text
  • Vision Language Model
