Evaluation of ChatGPT-4o’s answers to questions about hip arthroscopy from the patient perspective

Gökhan Ayık; Niyazi Ercan; Yunus Demirtaş; Tuğrul Yıldırım; Gökhan Çakmak

doi:10.52312/jdrs.2025.1961

Department of Orthopedics and Traumatology, Yüksek İhtisas University, Ankara, Türkiye

Keywords: Artificial intelligence, ChatGPT-4o, hip arthroscopy, patient education.

Abstract

Objectives: This study aimed to evaluate the responses provided by ChatGPT-4o to the questions most frequently asked by patients about hip arthroscopy.

Materials and methods: In this cross-sectional survey study, a new Google account without a search history was created to determine the 20 most frequently asked questions about hip arthroscopy via Google. These questions were posed to a new ChatGPT-4o account on June 1, 2024, and the responses were recorded. Ten orthopedic surgeons specializing in sports surgery rated each response for relevance, accuracy, clarity, and completeness on a scale from 1 to 5, with 1 being the worst and 5 being the best. Interrater reliability was assessed via the intraclass correlation coefficient (ICC).
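As an illustration of the reliability analysis described above, the following is a minimal sketch of one common ICC form. The abstract does not state which ICC model was used, so the choice of ICC(2,1) (two-way random effects, absolute agreement, single rater) is an assumption, as are the function name `icc_2_1` and the example ratings, which are hypothetical and not study data.

```python
from math import nan


def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is a list of rows: one row per rated item (e.g. one response),
    one column per rater. This ICC form is an assumption; the source does
    not specify the model that was used.
    """
    n = len(ratings)      # number of rated items
    k = len(ratings[0])   # number of raters
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need at least 2 items and 2 raters, rectangular data")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between items
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols                  # residual

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    denom = ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    # A constant ratings matrix has no variance at all; ICC is undefined.
    return (ms_rows - ms_error) / denom if denom != 0 else nan


# Hypothetical ratings: 4 responses (rows) scored by 3 raters (columns).
example = [[4, 5, 4], [5, 5, 4], [4, 4, 5], [5, 4, 5]]
print(icc_2_1(example))
```

With an ICC of this kind, ratings confined to a narrow band (for example, only 4s and 5s) leave little between-item variance, so the coefficient can be close to zero even when raters never differ by more than one point.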

Results: The lowest score given by the surgeons for any response was 4/5 in each subcategory. The highest mean scores were in accuracy and clarity, followed by relevance, with completeness receiving the lowest scores. The overall mean score was 4.49±0.16. Interrater reliability showed insufficient overall agreement (ICC=0.004, p=0.383), with the highest agreement in clarity (ICC=0.039, p=0.131) and the lowest in accuracy (ICC=–0.019, p=0.688).

Conclusion: The study confirms our hypothesis that ChatGPT-4o provides above-average quality responses to frequently asked questions about hip arthroscopy, as evidenced by the high scores in relevance, accuracy, clarity, and completeness. However, it remains advisable to consult orthopedic specialists on the subject and to incorporate ChatGPT's suggestions only as part of the final decision-making process.

Citation: Ayık G, Ercan N, Demirtaş Y, Yıldırım T, Çakmak G. Evaluation of ChatGPT-4o's answers to questions about hip arthroscopy from the patient perspective. Jt Dis Relat Surg 2025;36(1):193-199. doi: 10.52312/jdrs.2025.1961.

Author Contributions

All authors contributed to the study conception and design. Material preparation, data collection, and analysis were performed by G.A., N.E., and T.Y. The first draft of the manuscript was written by G.A. All authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Conflict of Interest

The authors declared no conflicts of interest with respect to the authorship and/or publication of this article.

Financial Disclosure

The authors received no financial support for the research and/or authorship of this article.

Data Sharing Statement

The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.