Abstract
Digital subtraction angiography (DSA) sequences are the gold standard for diagnosing most cerebrovascular diseases (CVDs). Rapid and accurate recognition of CVDs in DSA sequences helps clinicians make the right decisions, which is important in clinical practice. However, the pathological characteristics of CVDs are numerous and complex, and the spatiotemporal complexity of DSA sequences is high, making the diagnosis of CVDs challenging. In this paper, we therefore propose CLIP-DSA, a novel CVD classification framework based on CLIP, a pre-trained vision-language model. We aim to use textual knowledge to guide the robust classification of common CVDs in multi-view DSA sequences. Specifically, CLIP-DSA comprises a dual-branch vision encoder and a text encoder. The vision encoder extracts features from the multi-view sequences, while the text encoder supplies textual knowledge. To make full use of the temporal information in DSA sequences, we introduce a temporal pooling module that dynamically compresses image features along the time dimension. Additionally, we design a multi-view contrastive loss that enhances the network’s image-text representation ability by constraining the image features between the two views. On a large dataset of 2,026 patients, the proposed CLIP-DSA achieved an AUC of 90.8% in CVD classification. The code is available at https://github.com/jiongzhang-john/CLIP-DSA.
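The abstract names two components without giving their formulation: a temporal pooling module and a multi-view contrastive loss. The following is a minimal illustrative sketch only, not the authors' implementation. It assumes that temporal pooling can be approximated by an attention-weighted average over frames and that the multi-view loss resembles a symmetric InfoNCE objective between the two views. The function names, the `query` vector, and the `temperature` default are all hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def l2_normalize(v):
    norm = math.sqrt(dot(v, v)) or 1.0  # guard against a zero vector
    return [x / norm for x in v]


def temporal_pool(frames, query):
    """Compress T frame features (T x D) into one D-vector.

    Assumption: each frame is weighted by softmax(frame . query), where
    `query` stands in for a learned parameter.
    """
    weights = softmax([dot(f, query) for f in frames])
    dim = len(frames[0])
    return [sum(w * f[d] for w, f in zip(weights, frames)) for d in range(dim)]


def multi_view_contrastive_loss(view_a, view_b, temperature=0.07):
    """Symmetric InfoNCE between two views (assumed form of the loss).

    Row i of view_a and row i of view_b come from the same patient and
    are treated as the positive pair; all other rows are negatives.
    """
    a = [l2_normalize(v) for v in view_a]
    b = [l2_normalize(v) for v in view_b]
    n = len(a)
    logits = [[dot(a[i], b[j]) / temperature for j in range(n)] for i in range(n)]
    # Cross-entropy in the a -> b direction (rows of the similarity matrix).
    loss_ab = -sum(math.log(softmax(logits[i])[i]) for i in range(n)) / n
    # Cross-entropy in the b -> a direction (columns of the similarity matrix).
    cols = [[logits[i][j] for i in range(n)] for j in range(n)]
    loss_ba = -sum(math.log(softmax(cols[j])[j]) for j in range(n)) / n
    return 0.5 * (loss_ab + loss_ba)
```

Under these assumptions, a zero `query` reduces `temporal_pool` to a plain mean over frames, and the loss is small when matching rows of the two views point in the same direction. The published repository should be consulted for the actual design.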
| Original language | English |
|---|---|
| Title of host publication | Medical Image Computing and Computer Assisted Intervention, MICCAI 2025 |
| Subtitle of host publication | 28th International Conference, Daejeon, South Korea, September 23–27, 2025, Proceedings |
| Editors | James C. Gee, Daniel C. Alexander, Jaesung Hong, Juan Eugenio Iglesias, Carole H. Sudre, Archana Venkataraman, Polina Golland, Jong Hyo Kim, Jinah Park |
| Place of Publication | Cham |
| Publisher | Springer |
| Pages | 68-77 |
| Number of pages | 10 |
| Volume | VI |
| ISBN (Electronic) | 978-3-032-04978-0 |
| ISBN (Print) | 978-3-032-04977-3 |
| Publication status | Published - 19 Sept 2025 |
| Event | 28th International Conference on Medical Image Computing and Computer Assisted Intervention, MICCAI 2025 - Daejeon, Korea, Republic of. Duration: 23 Sept 2025 → 27 Sept 2025 |
Publication series
| Name | Lecture Notes in Computer Science (LNCS) |
|---|---|
| Volume | 15965 |
| ISSN (Print) | 0302-9743 |
| ISSN (Electronic) | 1611-3349 |
Conference
| Conference | 28th International Conference on Medical Image Computing and Computer Assisted Intervention, MICCAI 2025 |
|---|---|
| Country/Territory | Korea, Republic of |
| City | Daejeon |
| Period | 23/09/25 → 27/09/25 |
Bibliographical note
Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Switzerland AG 2026.
Keywords
- Cerebrovascular Diseases
- Digital Subtraction Angiography
- Image-Text
- Vision Language Model
Fingerprint
Dive into the research topics of 'CLIP-DSA: Textual Knowledge-Guided Cerebrovascular Diseases Recognition in Multi-view Digital Subtraction Angiography'. Together they form a unique fingerprint.