Abstract
Despite its poor reliability, peer assessment is the traditional method for assessing the appropriateness of health care activities. This article describes the reliability of human assessment of the appropriateness of diagnostic test requests. The authors used a random selection of 1217 tests from 253 request forms submitted by general practitioners in the Maastricht region of the Netherlands. Three reviewers independently assessed the appropriateness of each requested test. Interrater kappa values ranged from 0.33 to 0.42, and intrarater kappa values ranged from 0.48 to 0.68. The joint reliability coefficient of the 3 reviewers was 0.66. This reliability is sufficient to review test ordering over a series of cases but not to make case-by-case assessments. Sixteen reviewers are needed to obtain a joint reliability of 0.95. The authors conclude that there is substantial variation in assessments of what constitutes an appropriately requested diagnostic test and that this feedback method is not reliable enough for case-by-case assessment. Computer support may help to support the peer review process and make it more uniform.
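
For readers unfamiliar with the kappa statistic reported above, the following is a minimal sketch of Cohen's kappa for two reviewers, assuming each test is simply judged appropriate (1) or inappropriate (0). The function name `cohen_kappa` and the ten example ratings are hypothetical illustrations, not the study's data, and the abstract does not state which kappa variant or software the authors used.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("both raters must rate the same non-empty set of items")
    n = len(ratings_a)

    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: for each label, the product of the two raters'
    # marginal proportions, summed over all labels.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    labels = set(count_a) | set(count_b)
    expected = sum(count_a[label] * count_b[label] for label in labels) / (n * n)

    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings of ten requested tests (1 = appropriate, 0 = inappropriate).
reviewer_1 = [1, 1, 1, 0, 0, 1, 0, 1, 1, 0]
reviewer_2 = [1, 0, 1, 0, 1, 1, 0, 1, 0, 0]

print(round(cohen_kappa(reviewer_1, reviewer_2), 2))
```

In this made-up example the reviewers agree on 7 of 10 tests, but half of that agreement would be expected by chance, so kappa comes out at about 0.40, which happens to fall within the interrater range reported in the abstract.
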
Original language | English |
---|---|
Pages (from-to) | 31-37 |
Number of pages | 7 |
Journal | Medical Decision Making |
Volume | 23 |
Issue number | 1 |
Publication status | Published - 2003 |