Comparison of ChatGPT and Google in addressing patients’ questions on robot-assisted total hip arthroplasty
Mustafa Fatih Dasci1, Serkan Surucu2, Furkan Aral3, Mahmud Aydin4, Cihangir Turemis5, N. Amir Sandiford6, Mustafa Citak7
1Department of Orthopedics and Traumatology, University of Health Sciences, Bağcılar Training and Research Hospital, İstanbul, Türkiye
2Department of Orthopaedics and Rehabilitation, Yale University, New Haven, USA
3Department of Orthopedics and Traumatology, Gazi University Faculty of Medicine, Ankara, Türkiye
4Department of Orthopedics and Traumatology, Memorial Şişli Hospital, İstanbul, Türkiye
5Department of Orthopedics and Traumatology, Çeşme Alper Çizgenat State Hospital, İzmir, Türkiye
6Joint Reconstruction Unit, Southland Hospital, University of Otago, Invercargill, New Zealand
7Department of Orthopaedic Surgery, HELIOS ENDO-Klinik Hamburg, Hamburg, Germany
Keywords: Artificial intelligence, ChatGPT, clinical relevance, Google, health information quality, robot-assisted total hip arthroplasty, patient education.
Abstract
Objectives: This study aims to compare ChatGPT (Generative Pre-trained Transformer) and Google with respect to the frequently asked questions (FAQs), answers, and online sources they provide regarding robot-assisted total hip arthroplasty (RATHA).
Materials and methods: On December 15th, 2024, the 20 most frequently asked questions were independently identified by entering the search term “Robot-Assisted Total Hip Replacement” into a clean Google search and by prompting ChatGPT-4o. The FAQs on Google were sourced from the “People also ask” section, while ChatGPT was asked to generate the 20 most frequently asked questions. All questions, answers, and cited references were recorded. A modified version of the Rothwell system was used to categorize the questions into 10 subtopics: special activities, timeline of recovery, restrictions, technical details, cost, indications/management, risks and complications, pain, longevity, and evaluation of surgery. Each reference was categorized as commercial, academic, medical practice, single surgeon personal, government, or social media. Responses were also graded as “excellent response not requiring clarification” (1), “satisfactory requiring minimal clarification” (2), “satisfactory requiring moderate clarification” (3), or “unsatisfactory requiring substantial clarification” (4).
Results: Overall, 20% of the questions identified as most frequently asked overlapped between Google and ChatGPT-4o. Technical details (35%) was the most common question category. ChatGPT provided significantly more academic references than Google search (70% vs. 20%, p=0.0113). Conversely, Google web search cited medical practice references (40% vs. 0%, p=0.0033), single surgeon websites (20% vs. 0%, p=0.1060), and government websites (10% vs. 0%, p=0.4872) more frequently than ChatGPT. In terms of response quality, 62% of answers were rated as Grade 1-2 (excellent or satisfactory with minimal clarification), while 38% required moderate or substantial clarification (Grades 3-4).
Conclusion: ChatGPT yielded results comparable to those of Google searches for information on RATHA, with a greater reliance on academic sources. While most responses were satisfactory, a notable proportion required further clarification, underscoring the need for continued evaluation of these platforms to ensure accuracy and reliability in patient education. Taken together, these technologies have the capacity to enhance health literacy and support shared decision-making for patients seeking information on RATHA.
Citation: Dasci MF, Surucu S, Aral F, Aydin M, Turemis C, Sandiford NA, et al. Comparison of ChatGPT and Google in addressing patients’ questions on robot-assisted total hip arthroplasty. Jt Dis Relat Surg 2026;37(1):i-xiv. doi: 10.52312/jdrs.2026.2368.
