Evaluation of ChatGPT-4o’s answers to questions about hip arthroscopy from the patient perspective
Gökhan Ayık, Niyazi Ercan, Yunus Demirtaş, Tuğrul Yıldırım, Gökhan Çakmak
Department of Orthopedics and Traumatology, Yüksek İhtisas University, Ankara, Türkiye
Keywords: Artificial intelligence, ChatGPT-4o, hip arthroscopy, patient education.
Abstract
Objectives: This study aimed to evaluate the responses provided by ChatGPT-4o to the questions most frequently asked by patients regarding hip arthroscopy.
Materials and methods: In this cross-sectional survey study, a new Google account without a search history was created to determine the 20 most frequently asked questions about hip arthroscopy via Google. These questions were posed to a new ChatGPT-4o account on June 1, 2024, and the responses were recorded. Ten orthopedic surgeons specializing in sports surgery rated the responses using a rating scale to assess relevance, accuracy, clarity, and completeness. The responses were scored on a scale from 1 to 5, with 1 being the worst and 5 being the best. The interrater reliability was assessed via the intraclass correlation coefficient (ICC).
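To make the reliability measure concrete, the following is a minimal sketch of how an ICC can be computed from a matrix of ratings (rows are rated responses, columns are raters). The abstract does not state which ICC form was used, so the two-way random-effects, absolute-agreement, single-rater form, ICC(2,1), is an assumption made here for illustration. The function name icc_2_1 and the small example matrix are hypothetical and are not data from this study.

```python
import numpy as np


def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: 2-D array-like, shape (n_items, n_raters).
    Assumption: this ICC form is chosen for illustration only; the
    study does not specify which form it used.
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand_mean = x.mean()

    # Sums of squares for items (rows), raters (columns), and residual error
    ss_rows = k * np.sum((x.mean(axis=1) - grand_mean) ** 2)
    ss_cols = n * np.sum((x.mean(axis=0) - grand_mean) ** 2)
    ss_total = np.sum((x - grand_mean) ** 2)
    ss_error = ss_total - ss_rows - ss_cols

    # Mean squares from the two-way ANOVA without replication
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Illustrative ratings only (6 items rated by 4 raters), not study data
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(round(icc_2_1(example), 2))  # approximately 0.29
```

Because the ICC compares between-item variance with total variance, it can fall near zero, or slightly below it, when almost all ratings sit in a narrow band such as 4 to 5, even though the raters rarely differ by more than one point.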
Results: The lowest score given by the surgeons for any response was 4/5 in each subcategory. The highest mean scores were in accuracy and clarity, followed by relevance, with completeness receiving the lowest scores. The overall mean score was 4.49±0.16. Interrater reliability showed insufficient overall agreement (ICC=0.004, p=0.383), with the highest agreement in clarity (ICC=0.039, p=0.131) and the lowest in accuracy (ICC=–0.019, p=0.688).
Conclusion: The study confirms our hypothesis that ChatGPT-4o provides above-average quality responses to frequently asked questions about hip arthroscopy, as evidenced by the high scores in relevance, accuracy, clarity, and completeness. However, it is still advisable to consult orthopedic specialists on the subject, incorporating ChatGPT's suggestions during the final decision-making process.
Citation: Ayık G, Ercan N, Demirtaş Y, Yıldırım T, Çakmak G. Evaluation of ChatGPT-4o's answers to questions about hip arthroscopy from the patient perspective. Jt Dis Relat Surg 2025;36(1):193-199. doi: 10.52312/jdrs.2025.1961.
All authors contributed to the study conception and design. Material preparation, data collection, and analysis were performed by G.A., N.E., and T.Y. The first draft of the manuscript was written by G.A. All authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
The authors declared no conflicts of interest with respect to the authorship and/or publication of this article.
The authors received no financial support for the research and/or authorship of this article.
The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.