Validation of an Artificial Intelligence-Powered Virtual Assistant for Emergency Triage in Neurology

Alessandro, Lucas; Crema, Santiago; Castiglione, Juan Ignacio; Dossi, Daiana Elizabeth; Eberbach, Federico; Kohler, Alejandro Alfredo; Laffue, Alfredo Hernan; Marone, Abril; Nagel, Vanesa; Pastor Rueda, José Manuel; Varela, Francisco José; Fernández Slezak, Diego; Rodríguez Murúa, Sofía; Debasa, Carlos; Pensa, Claudio; Farez, Mauricio Franco

URI: https://doi.org/10.1097/NRL.0000000000000594
https://repositorio.fleni.org.ar/xmlui/handle/123456789/1308

Date: 2025-02-06

Abstract:

Objectives: Neurological emergencies pose significant challenges in medical care in resource-limited countries. Artificial intelligence (AI), particularly health chatbots, offers a promising solution. Rigorous validation is required to ensure safety and accuracy. Our objective is to evaluate the diagnostic safety and effectiveness of an AI-powered virtual assistant (VA) designed for the triage of neurological pathologies.

Methods: The performance of an AI-powered VA for emergency neurological triage was tested. Ten patients over 18 years old with urgent neurological pathologies were selected. In the first stage, 9 neurologists assessed the safety of the VA using the patients' clinical records. In the second stage, the assistant's accuracy when used by patients was evaluated. Finally, VA performance was compared with ChatGPT 3.5 and 4.

Results: In stage 1, neurologists agreed with the VA in 98.5% of the cases for syndromic diagnosis, and in all cases, the definitive diagnosis was among the top 5 differentials. In stage 2, neurologists agreed with all diagnostic parameters and recommendations suggested by the assistant to patients. The average use time was 5.5 minutes (average of 16.5 questions). The VA showed superiority over both versions of ChatGPT in all evaluated diagnostic and safety aspects (P<0.0001). In 57.8% of the evaluations, neurologists rated the VA as "excellent" (suggesting adequate utility).

Conclusions: In this study, the VA showed promising diagnostic accuracy and user satisfaction, bolstering confidence in further development. These outcomes encourage proceeding to a comprehensive phase 1/2 trial with 100 patients to thoroughly assess its "real-time" application in emergency neurological triage.
