Manuscript submitted May 6, 2025; accepted May 20, 2025; published July 17, 2025
Abstract—The Advanced Trauma Life Support (ATLS) certification evaluates the ability of medical professionals to manage trauma patients effectively in emergency settings. With the rapid evolution of Large Language Models (LLMs), there is growing interest in exploring how these tools might integrate into clinical practice. This study assessed the performance of three LLMs—GPT-3.5, Gemini, and GPT-4—on the ATLS written examinations. Each model answered three different ATLS 10th edition exams. Their responses were compared to official answer keys, and average scores were calculated. Differences in performance among the LLMs were analyzed using chi-square testing. In addition, performance was examined based on question type: direct knowledge questions versus clinical scenario questions. GPT-3.5 achieved an average score of 65%, Gemini 61.7%, and GPT-4 83.3%. Among the three models, only GPT-4 surpassed the passing threshold of 75%. There was no statistically significant difference between the scores of GPT-3.5 and Gemini (p = 0.59). However, GPT-4 significantly outperformed both GPT-3.5 (p = 0.0012) and Gemini (p = 0.0002). No significant differences in performance were noted between direct and clinical scenario questions within each model. GPT-4 demonstrated the ability to successfully pass the ATLS examination, highlighting its advanced technical knowledge. Nonetheless, occasional inaccuracies or "hallucinations" were observed, particularly with more complex questions. With continued development and rigorous validation, LLMs like GPT-4 have the potential to serve as valuable adjuncts in clinical decision-making and trauma education.
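The pairwise comparisons described in the abstract can be sketched as a Pearson chi-square test on a 2x2 table of correct versus incorrect answers. The abstract does not state how many questions were asked, so the total of 120 questions (three exams of 40 questions each) and the per-model correct counts below are assumptions back-calculated from the reported percentages. The function name `chi_square_2x2` is hypothetical, and whether the authors applied a continuity correction is not stated; this sketch applies none.

```python
import math

# ASSUMPTION: 120 questions in total (3 exams x 40); the abstract does not give this number.
TOTAL_QUESTIONS = 120
# ASSUMPTION: correct counts inferred from the reported 65%, 61.7%, and 83.3%.
CORRECT = {"GPT-3.5": 78, "Gemini": 74, "GPT-4": 100}
PASS_THRESHOLD = 0.75  # passing threshold stated in the abstract


def chi_square_2x2(correct_a, correct_b, total):
    """Pearson chi-square test (no continuity correction) comparing two
    models that each answered `total` questions. Returns (chi2, p_value)."""
    table = [
        [correct_a, total - correct_a],
        [correct_b, total - correct_b],
    ]
    n = 2 * total
    column_totals = [correct_a + correct_b, n - correct_a - correct_b]
    chi2 = 0.0
    for row in table:
        for j, observed in enumerate(row):
            # Both rows sum to `total`, so expected = row total * column total / n.
            expected = total * column_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    for name, correct in CORRECT.items():
        score = correct / TOTAL_QUESTIONS
        print(f"{name}: {score:.1%} passed={score >= PASS_THRESHOLD}")
    for a, b in [("GPT-3.5", "Gemini"), ("GPT-4", "GPT-3.5"), ("GPT-4", "Gemini")]:
        chi2, p = chi_square_2x2(CORRECT[a], CORRECT[b], TOTAL_QUESTIONS)
        print(f"{a} vs {b}: chi2={chi2:.3f}, p={p:.4f}")
```

Under these assumed counts, the computed p-values round to the figures reported in the abstract. This is consistent with, but does not confirm, an uncorrected test on 120 questions.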
Keywords—artificial intelligence, Advanced Trauma Life Support (ATLS), ChatGPT, Google, trauma
Cite: Hilary Y. Liu, Mario Alessandri Bonetti, Alain C. Corcos, Jenny A. Ziembicki, Francesco M. Egro, "GPT-3.5, Gemini, and GPT-4 Performance on the Advanced Trauma Life Support Exam," Journal of Advances in Artificial Intelligence, vol. 3, no. 3, pp. 180-186, 2025. doi: 10.18178/JAAI.2025.3.3.180-186
Copyright © 2025 by the authors. This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).