Abstract:
Purpose: To investigate the diagnostic performance of an Artificial Intelligence (AI) system for the detection of COVID-19 on chest radiographs (CXR), and to compare its results with those of physicians working alone or with AI support.
Materials and methods: An AI system was fine-tuned to discriminate confirmed COVID-19 pneumonia from other viral and bacterial pneumonia and non-pneumonia patients, and was used to review 302 CXR images from adult patients retrospectively sourced from nine different databases. Fifty-four physicians, blinded to the diagnosis, were invited to interpret the images under identical conditions in a test set and were randomly assigned either to receive or not to receive support from the AI system. Diagnostic performance was then compared between physicians working with and without AI support. AI system performance was evaluated using the area under the receiver operating characteristic curve (AUROC), and physician sensitivity and specificity were compared with those of the AI system.
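As an illustrative sketch only (not the authors' code), the evaluation metrics named above can be computed from per-image predicted probabilities and ground-truth labels; the labels, scores, and 0.5 operating threshold below are hypothetical stand-ins, not data from the study.

```python
# Minimal sketch: AUROC, sensitivity, and specificity for a binary
# COVID-19 vs. non-COVID-19 classification task (synthetic data).
import numpy as np
from sklearn.metrics import roc_auc_score, confusion_matrix

rng = np.random.default_rng(0)

# Hypothetical stand-ins for 302 test-set labels and AI scores.
y_true = rng.integers(0, 2, size=302)          # 1 = confirmed COVID-19 pneumonia
y_score = np.clip(y_true * 0.6 + rng.normal(0.3, 0.2, size=302), 0.0, 1.0)

auroc = roc_auc_score(y_true, y_score)         # area under the ROC curve

# Sensitivity and specificity at an assumed operating threshold of 0.5.
y_pred = (y_score >= 0.5).astype(int)
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)

print(f"AUROC={auroc:.2f}  sensitivity={sensitivity:.2f}  specificity={specificity:.2f}")
```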
Results: The AI system discriminated COVID-19 pneumonia with an AUROC of 0.96 in the validation set and 0.83 in the external test set. The AI system outperformed physicians overall (70% increase in sensitivity and 1% increase in specificity; p < 0.0001). When working with AI support, physicians increased their diagnostic sensitivity from 47% to 61% (p < 0.001), although specificity decreased from 79% to 75% (p = 0.007).
Conclusions: Our results suggest that interpreting CXR with AI support increases physician diagnostic sensitivity for COVID-19 detection. This human-machine partnership may help expedite triage efforts and improve resource allocation in the current crisis.