Publication:
Who Is More Successful in a Spinal Surgery Examination? ChatGPT-3.5/4.0 or a Resident Doctor

dc.contributor.author: Kaya, Ozcan
dc.contributor.author: Dincer, Recep
dc.contributor.author: Coskun, Huseyin Sina
dc.contributor.author: Karapınar, Sefa Erdem
dc.date.accessioned: 2025-12-11T01:44:12Z
dc.date.issued: 2025
dc.department: Ondokuz Mayıs Üniversitesi [en_US]
dc.department-temp: Sağlık Bilimleri Üniversitesi, Süleyman Demirel Üniversitesi, Ondokuz Mayıs Üniversitesi, Süleyman Demirel Üniversitesi [en_US]
dc.description.abstract: Objective: As in all work sectors, artificial intelligence (AI) is now widely used, and with advances in technology its use has increased especially in medicine. The aim of this study was to compare the responses given by Chat Generative Pre-trained Transformer (ChatGPT)-4.0, ChatGPT-3.5, and orthopaedics and traumatology residents to the Turkish Orthopedics and Traumatology Education Council (TOTEK) questions about the spine. Materials and Methods: A total of 15 residents in the orthopaedics and traumatology clinic of a tertiary-level university hospital participated in an examination consisting only of spine-related questions. The same questions were posed to ChatGPT-3.5 and ChatGPT-4.0 on two different days. The examination consisted of true/false, theoretical/classical, and diagram/visual sections, each scored out of 100 points. The average score was calculated and the results were evaluated by two instructors. Results: The mean score obtained was 72.88 for ChatGPT-3.5 (p=0.005) and 69.38 for ChatGPT-4.0 (p=0.001), showing a 5.87% difference in success. The mean score obtained by the orthopaedic residents was 69.90 (p=0.779). Both ChatGPT-3.5 and ChatGPT-4.0 were observed to have a knowledge level equivalent to that of a 3rd-year resident. Conclusion: The 4th- and 5th-year orthopaedic residents were able to answer more questions correctly than ChatGPT-3.5 and GPT-4 on the spine assessment questions. Both ChatGPT-3.5 and GPT-4 performed better on text-only questions than on visual questions. It is unlikely that GPT-4 or ChatGPT-3.5 would pass the TOTEK written examination. [en_US]
dc.identifier.doi: 10.4274/jtss.galenos.2025.15870
dc.identifier.endpage: 91 [en_US]
dc.identifier.issn: 2147-5903
dc.identifier.issue: 2 [en_US]
dc.identifier.scopusquality: Q4
dc.identifier.startpage: 88 [en_US]
dc.identifier.trdizinid: 1316544
dc.identifier.uri: https://doi.org/10.4274/jtss.galenos.2025.15870
dc.identifier.uri: https://search.trdizin.gov.tr/en/yayin/detay/1316544/who-is-more-successful-in-a-spinal-surgery-examination-chatgpt-3540-or-a-resident-doctor
dc.identifier.uri: https://hdl.handle.net/20.500.12712/45687
dc.identifier.volume: 36 [en_US]
dc.language.iso: en [en_US]
dc.relation.ispartof: Journal of Turkish Spinal Surgery [en_US]
dc.relation.publicationcategory: Article - National Peer-Reviewed Journal - Institutional Faculty Member [en_US]
dc.rights: info:eu-repo/semantics/openAccess [en_US]
dc.title: Who Is More Successful in a Spinal Surgery Examination? ChatGPT-3.5/4.0 or a Resident Doctor [en_US]
dc.type: Article [en_US]
dspace.entity.type: Publication
