TY - GEN
T1 - Evaluating How Explainable AI Is Perceived in the Medical Domain
T2 - 2nd International Workshop on Trustworthy Artificial Intelligence for Healthcare, TAI4H 2024
AU - Karagoz, Gizem
AU - van Kollenburg, Geert
AU - Ozcelebi, Tanir
AU - Meratnia, Nirvana
PY - 2024/8/1
Y1 - 2024/8/1
N2 - The crucial role of Explainable Artificial Intelligence (XAI) in healthcare is underscored by the need for both accurate diagnosis and transparent decision making, to improve trust in the decisions on the one hand and to facilitate adoption by medical professionals on the other. In this paper, we present the results of a quantitative user study that evaluates how widely used XAI methods are perceived by medical experts. To do so, we utilize two prominent post-hoc, model-agnostic XAI methods: Local Interpretable Model-agnostic Explanations (LIME) and SHapley Additive exPlanations (SHAP). A cohort of 97 medical experts was recruited to investigate whether these XAI methods assist them in diagnosing chest X-ray scans. We designed an evaluation framework to investigate diagnostic accuracy, change in trust, coherence with expert reasoning, and differences in confidence before and after seeing the explanations provided by the XAI methods. This large-scale study showed that both XAI methods improve scores on indicative explanations. The overall change in trust did not differ significantly between LIME and SHAP, indicating that factors beyond the provided explanations contribute to trust in AI diagnostics. This work proposes a robust, human-centered benchmark that supports the research and development of interpretable, reliable, and clinically aligned AI tools and directs the future of AI in high-stakes healthcare applications towards greater transparency and accountability.
AB - The crucial role of Explainable Artificial Intelligence (XAI) in healthcare is underscored by the need for both accurate diagnosis and transparent decision making, to improve trust in the decisions on the one hand and to facilitate adoption by medical professionals on the other. In this paper, we present the results of a quantitative user study that evaluates how widely used XAI methods are perceived by medical experts. To do so, we utilize two prominent post-hoc, model-agnostic XAI methods: Local Interpretable Model-agnostic Explanations (LIME) and SHapley Additive exPlanations (SHAP). A cohort of 97 medical experts was recruited to investigate whether these XAI methods assist them in diagnosing chest X-ray scans. We designed an evaluation framework to investigate diagnostic accuracy, change in trust, coherence with expert reasoning, and differences in confidence before and after seeing the explanations provided by the XAI methods. This large-scale study showed that both XAI methods improve scores on indicative explanations. The overall change in trust did not differ significantly between LIME and SHAP, indicating that factors beyond the provided explanations contribute to trust in AI diagnostics. This work proposes a robust, human-centered benchmark that supports the research and development of interpretable, reliable, and clinically aligned AI tools and directs the future of AI in high-stakes healthcare applications towards greater transparency and accountability.
KW - Explainable AI
KW - Human-Centered Evaluation
KW - Medical Imaging
KW - XAI Evaluation
KW - XAI in Healthcare
UR - http://www.scopus.com/inward/record.url?scp=85201190554&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-67751-9_8
DO - 10.1007/978-3-031-67751-9_8
M3 - Conference contribution
AN - SCOPUS:85201190554
SN - 978-3-031-67750-2
T3 - Lecture Notes in Computer Science (LNCS)
SP - 92
EP - 108
BT - Trustworthy Artificial Intelligence for Healthcare
A2 - Chen, Hao
A2 - Zhou, Yuyin
A2 - Xu, Daguang
A2 - Vardhanabhuti, Varut Vince
PB - Springer
CY - Cham
Y2 - 4 August 2024 through 4 August 2024
ER -