Appropriate trust in artificial intelligence for the optical diagnosis of colorectal polyps: The role of human/artificial intelligence interaction

Quirine E.W. van der Zander (Corresponding author), Rachel Roumans, Carolus H.J. Kusters, Nikoo Dehghani, Ad A.M. Masclee, Peter H.N. de With, Fons van der Sommen, Chris C.P. Snijders, Erik J. Schoon

Research output: Contribution to journal › Article › Academic › peer-review

14 Citations (Scopus)
25 Downloads (Pure)

Abstract

Background and Aims: Computer-aided diagnosis (CADx) for the optical diagnosis of colorectal polyps has been thoroughly investigated. However, studies on human–artificial intelligence interaction are lacking. Our aim was to investigate endoscopists’ trust in CADx by evaluating whether communicating a calibrated algorithm confidence score improved trust.

Methods: Endoscopists optically diagnosed 60 colorectal polyps. Initially, endoscopists diagnosed each polyp without CADx assistance (initial diagnosis). Immediately afterward, the same polyp was shown again with a CADx prediction: either only a prediction (benign or premalignant) or a prediction accompanied by a calibrated confidence score (0-100). A confidence score of 0 indicated a benign prediction, and a score of 100 indicated a (pre)malignant prediction. For half of the polyps, CADx was mandatory, and for the other half, CADx was optional. After reviewing the CADx prediction, endoscopists made a final diagnosis. Histopathology was used as the gold standard. Endoscopists’ trust in CADx was measured as CADx prediction utilization: the willingness to follow CADx predictions when the endoscopists initially disagreed with the CADx prediction.

Results: Twenty-three endoscopists participated. Presenting CADx predictions increased the endoscopists’ diagnostic accuracy (69.3% initial vs 76.6% final diagnosis, P < .001). The CADx prediction was used in 36.5% (n = 183 of 501) of disagreements. Adding a confidence score led to lower CADx prediction utilization, except when the confidence score surpassed 60. Mandatory CADx decreased CADx prediction utilization compared to optional CADx. Appropriate trust (using correct or disregarding incorrect CADx predictions) was 48.7% (n = 244 of 501).

Conclusions: Appropriate trust was common, and CADx prediction utilization was highest for the optional CADx without confidence scores. These results underscore the importance of a better understanding of human–artificial intelligence interaction.
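The two trust measures defined in the abstract can be illustrated with a minimal sketch. It assumes each reading is recorded as four binary labels (initial diagnosis, CADx prediction, final diagnosis, histopathology) and that both measures are taken over the cases of initial disagreement, as the shared denominator of 501 suggests. The `Reading` class, the `trust_metrics` function, and the example data are hypothetical and are not taken from the study's analysis code.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    """One endoscopist/polyp reading (hypothetical record layout)."""
    initial: str  # endoscopist's diagnosis before seeing CADx
    cadx: str     # CADx prediction
    final: str    # endoscopist's diagnosis after seeing CADx
    truth: str    # histopathology (gold standard)


def trust_metrics(readings):
    """Compute utilization and appropriate trust over initial disagreements."""
    # Only readings where the endoscopist initially disagreed with CADx count.
    disagreements = [r for r in readings if r.initial != r.cadx]
    n = len(disagreements)
    if n == 0:
        return None

    # Utilization: the final diagnosis follows the CADx prediction.
    followed = sum(r.final == r.cadx for r in disagreements)

    # Appropriate trust: CADx followed when correct, or disregarded when incorrect.
    appropriate = sum(
        (r.final == r.cadx) == (r.cadx == r.truth) for r in disagreements
    )

    return {
        "disagreements": n,
        "utilization": followed / n,
        "appropriate_trust": appropriate / n,
    }


# Made-up example data, for illustration only.
example = [
    Reading("benign", "premalignant", "premalignant", "premalignant"),  # followed, CADx correct
    Reading("premalignant", "benign", "premalignant", "premalignant"),  # disregarded, CADx incorrect
    Reading("benign", "premalignant", "premalignant", "benign"),        # followed, CADx incorrect
    Reading("benign", "benign", "benign", "benign"),                    # agreement, excluded
    Reading("premalignant", "benign", "premalignant", "benign"),        # disregarded, CADx correct
]

metrics = trust_metrics(example)
print(metrics)
```

Applied to the counts reported above, the same ratios give 183 / 501 ≈ 36.5% for CADx prediction utilization and 244 / 501 ≈ 48.7% for appropriate trust.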

Original language: English
Pages (from-to): 1070-1078-e10
Number of pages: 19
Journal: Gastrointestinal Endoscopy
Volume: 100
Issue number: 6
Early online date: 26 Jun 2024
Publication status: Published - Dec 2024

Funding

The following authors disclosed financial relationships: Q. E. W. van der Zander: Supported by Fujifilm Inc. to attend scientific meetings, outside the submitted work. A. A. M. Masclee: Supported by a health care efficiency grant from ZonMw, an unrestricted research grant from Will Pharma S.A., a restricted educational grant from Ferring B.V., a research grant from Pentax Europa, and research funding from Allergan and Grünenthal; gave scientific advice to Bayer, Kyowa Kirin, and Takeda, outside the submitted work. F. van der Sommen: Research support from Olympus, outside the submitted work. E. J. Schoon: Research support and speaker's fees from Fujifilm Inc., outside the submitted work. A. A. M. Masclee, F. van der Sommen, P. H. N. de With, and E. J. Schoon report a joint research grant from the Dutch Cancer Society for the submitted work. All other authors disclosed no financial relationships. The Dutch Cancer Society financially supported this study (project number 12639). The Dutch Cancer Society did not contribute to the study design, data collection, data analysis, data interpretation, writing of the manuscript, or the decision to submit the paper for publication.

Keywords

  • Humans
  • Artificial Intelligence
  • Colonic Polyps/diagnosis
  • Diagnosis, Computer-Assisted/methods
  • Colonoscopy/methods
  • Trust
  • Male
  • Female
  • Algorithms
  • Colorectal Neoplasms/diagnosis
  • Precancerous Conditions/diagnosis
  • Middle Aged
