Figure - available from: Frontiers in Public Health
ChatGPT-generated response exemplifying a potential occurrence of outdated information.

Source publication
Article
Artificial intelligence (AI) chatbots have the potential to revolutionize online health information-seeking behavior by delivering up-to-date information on a wide range of health topics. They generate personalized responses to user queries through their ability to process extensive amounts of text, analyze trends, and generate natural language responses. ...

Citations

... Though chatbots can be pre-trained on reputable cancer information, new treatment guidelines and newly published clinical trial data emerge rapidly, so the system must be updated continually. If periodic re-training fails to incorporate the latest oncology standards, or if the chatbots inadvertently integrate unverified content, misinformation will gradually accumulate [39,40]. The impetus to operate chatbots in real time on social media platforms further heightens this risk. ...
Article
The rapid integration of AI-driven chatbots into oncology education represents both a transformative opportunity and a critical challenge. These systems, powered by advanced language models, can deliver personalized, real-time cancer information to patients, caregivers, and clinicians, bridging gaps in access and availability. However, their ability to convincingly mimic human-like conversation raises pressing concerns regarding misinformation, trust, and their overall effectiveness in digital health communication. This review examines the dual-edged role of AI chatbots, exploring their capacity to support patient education and alleviate clinical burdens, while highlighting the risks of algorithmic opacity (i.e., the inability to see the data and reasoning used to make a decision, which hinders appropriate future action), false information, and the ethical dilemmas posed by human-seeming AI entities. Strategies to mitigate these risks include robust oversight, transparent algorithmic development, and alignment with evidence-based oncology protocols. Ultimately, the responsible deployment of AI chatbots requires a commitment to safeguarding the core values of evidence-based practice, patient trust, and human-centered care.
... Artificial intelligence (AI) models (such as large language models [LLMs]) demonstrate notable capabilities in natural language processing tasks in the health care domain [12], surpassing previous state-of-the-art methods in areas like information retrieval [13][14][15], question answering [16][17][18], and text summarization [19][20][21]. Particularly via publicly available tools (e.g., ChatGPT), health information seeking and access are starting to shift from basic keyword-based searching to LLM-augmented searches. ...
Article
Purpose Caregivers in pediatric oncology need accurate and understandable information about their child's condition, treatment, and side effects. This study assesses the performance of publicly accessible large language model (LLM)‐supported tools in providing valuable and reliable information to caregivers of children with cancer. Methods In this cross‐sectional study, we evaluated the performance of four LLM‐supported tools (ChatGPT (GPT‐4), Google Bard (Gemini Pro), Microsoft Bing Chat, and Google SGE) against a set of frequently asked questions (FAQs) derived from the Children's Oncology Group Family Handbook and expert input (in total, 26 FAQs and 104 generated responses). Five pediatric oncology experts assessed the generated LLM responses using measures including accuracy, clarity, inclusivity, completeness, clinical utility, and overall rating. Additionally, content quality was evaluated, including readability, AI disclosure, source credibility, resource matching, and content originality. We used descriptive analysis and statistical tests including Shapiro–Wilk, Levene's, Kruskal–Wallis H‐tests, and Dunn's post hoc tests for pairwise comparisons. Results ChatGPT showed high overall performance when evaluated by the experts. Bard also performed well, especially in accuracy and clarity, whereas Bing Chat and Google SGE had lower overall scores. AI disclosure was observed less frequently in ChatGPT responses, which may have affected their clarity, whereas Bard maintained a balance between AI disclosure and response clarity. Google SGE generated the most readable responses, whereas ChatGPT answered with the most complexity. LLM tools varied significantly (p < 0.001) across all expert evaluations except inclusivity.
Through our thematic analysis of expert free‐text comments, emotional tone and empathy emerged as a unique theme, with mixed feedback on whether AI should be expected to be empathetic. Conclusion LLM‐supported tools can enhance caregivers' knowledge of pediatric oncology. Each model has unique strengths and areas for improvement, indicating the need for careful selection based on specific clinical contexts. Further research is required to explore their application in other medical specialties and patient demographics, assessing broader applicability and long‐term impacts.
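The four-way tool comparison above rests on the Kruskal–Wallis H-test, a rank-based test suited to small samples of ordinal expert ratings. As a rough illustration only (this is not the authors' analysis code, and the group structure is generic), the tie-corrected H statistic can be computed in pure Python:

```python
# Minimal sketch of the Kruskal-Wallis H statistic (tie-corrected), the test
# the study used to compare expert ratings across LLM tools. Illustrative
# only; the authors' actual analysis pipeline is not shown in the abstract.
from collections import Counter

def average_ranks(values):
    """Map each value in the pooled sample to its rank (ties get the average rank)."""
    sorted_vals = sorted(values)
    first = {}  # first 1-based position at which each value appears
    for i, v in enumerate(sorted_vals, start=1):
        first.setdefault(v, i)
    counts = Counter(sorted_vals)
    # a value occupying ranks first[v] .. first[v]+counts[v]-1 gets their mean
    return {v: first[v] + (counts[v] - 1) / 2 for v in counts}

def kruskal_wallis_h(groups):
    """H statistic for k independent samples of (possibly tied) ordinal scores."""
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    rank_of = average_ranks(pooled)
    h = 12 / (n * (n + 1)) * sum(
        sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups
    ) - 3 * (n + 1)
    # correct for ties: divide by 1 - sum(t^3 - t) / (n^3 - n)
    ties = sum(t**3 - t for t in Counter(pooled).values())
    return h / (1 - ties / (n**3 - n)) if ties else h
```

A significant H only says that at least one group differs; as in the study, pairwise follow-up (e.g., Dunn's post hoc tests) is then needed to identify which tools differ from which.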
... Artificial Intelligence (AI) models (such as Large Language Models (LLMs)) and their applications have demonstrated notable capabilities in natural language processing tasks, surpassing previous state-ofthe-art methods in areas like information retrieval 6 , question answering 7 , and medical text summarization 8 . These tools are increasingly being utilized to improve access to online medical information 9 . ...
Preprint
Background and Objectives: In pediatric oncology, caregivers seek detailed, accurate, and understandable information about their child's condition, treatment, and side effects. The primary aim of this study was to assess the performance of four publicly accessible large language model (LLM)-supported knowledge generation and search tools in providing valuable and reliable information to caregivers of children with cancer. Methods: This cross-sectional study evaluated the performance of the four LLM-supported tools (ChatGPT (GPT-4), Google Bard (Gemini Pro), Microsoft Bing Chat, and Google SGE) against a set of frequently asked questions (FAQs) derived from the Children's Oncology Group Family Handbook and expert input. Five pediatric oncology experts assessed the generated LLM responses using measures including Accuracy (3-point ordinal scale), Clarity (3-point ordinal scale), Inclusivity (3-point ordinal scale), Completeness (dichotomous nominal scale), Clinical Utility (5-point Likert scale), and Overall Rating (4-point ordinal scale). Additional content quality criteria such as Readability (ordinal scale; 5th to 18th grade reading level), Presence of AI Disclosure (dichotomous scale), Source Credibility (3-point interval scale), Resource Matching (3-point ordinal scale), and Content Originality (ratio scale) were also evaluated. We used descriptive analysis including the mean, standard deviation, median, and interquartile range. We conducted the Shapiro-Wilk test for normality, Levene's test for homogeneity of variances, and Kruskal-Wallis H-tests with Dunn's post-hoc tests for pairwise comparisons. Results: Through expert evaluation, ChatGPT showed high performance in Accuracy (M=2.71, SD=0.235), Clarity (M=2.73, SD=0.271), Completeness (M=0.815, SD=0.203), Clinical Utility (M=3.81, SD=0.544), and Overall Rating (M=3.13, SD=0.419).
Bard also performed well, especially in Accuracy (M=2.56, SD=0.400) and Clarity (M=2.54, SD=0.411), while Bing Chat (Accuracy M=2.33, SD=0.456; Clarity M=2.29, SD=0.424) and Google SGE (Accuracy M=2.08, SD=0.552; Clarity M=1.95, SD=0.541) had lower overall scores. The Presence of AI Disclosure was less frequent in ChatGPT responses (M=0.69, SD=0.46), which affected Clarity (M=2.73, SD=0.266), whereas Bard maintained a balance between AI Disclosure (M=0.92, SD=0.27) and Clarity (M=2.54, SD=0.403). Overall, we observed significant differences between LLM tools (p < .01). Conclusions: LLM-supported tools can contribute to caregivers' knowledge of pediatric oncology topics. Each model has unique strengths and areas for improvement, suggesting the need for careful selection and evaluation based on specific clinical contexts. Further research is needed to explore the application of these tools in other medical specialties and patient demographics to assess their broader applicability and long-term impacts, including the usability and feasibility of using LLM-supported tools with caregivers.
... The analysis of attitudes towards utilizing health information found online revealed that many participants integrate such information into their medical consultations, indicating a proactive approach to managing their health (34,35). However, healthcare providers should be prepared to address patients who reference online sources and offer appropriate guidance (36). Most individuals adhere to the advice and treatment prescribed by their healthcare providers despite accessing online resources, revealing a healthy equilibrium between online information consumption and trust in medical professionals (37). ...
Article
Background and aim of the work: E-health has generated benefits and challenges, such as the spread of false information online. This study investigated the impact of online health research on the treatment and diagnosis of children aged 0 to 12 whose guardians belong to Generation Y, examining interference in the doctor-patient relationship and the influence of socioeconomic factors. The practical implications of these findings are significant, as they can guide communication strategies, provide reliable information, and promote collaboration between professionals and patients. Methods: A descriptive cross-sectional study carried out in a hospital in the interior of São Paulo between 2022 and 2023. Participants aged 32 to 42, responsible for children aged 0 to 12, answered a questionnaire on health research habits. We performed the statistical analysis using descriptive statistics, Spearman's correlation, and the Mann-Whitney test. Results: Our sample of 101 participants was predominantly female (80.2%) and heterogeneous in age, marital status, skin color, schooling, and occupation. All participants (100%) used cell phones to access the internet, and 34% always searched for health information online. Most found the information understandable (51.4%) and valuable (58.6%) but were cautious about its reliability. We found correlations between demographic data and the perceived importance of different sources. Conclusions: The study contributes to understanding online health information-seeking habits and behaviors. (www.actabiomedica.it)
Article
Introduction Large language models (LLMs) such as ChatGPT can potentially transform the delivery of health information. This study aims to evaluate the accuracy and completeness of ChatGPT in responding to questions on recombinant zoster vaccination (RZV) in patients with rheumatic and musculoskeletal diseases. Methods A cross-sectional study was conducted using 20 prompts based on information from the Centers for Disease Control and Prevention (CDC), the Advisory Committee on Immunization Practices (ACIP), and the American College of Rheumatology (ACR). These prompts were inputted into ChatGPT 3.5. Five rheumatologists independently scored the ChatGPT responses for accuracy (Likert 1 to 5) and completeness (Likert 1 to 3) against validated information sources (CDC, ACIP, and ACR). Results The overall mean accuracy of ChatGPT responses on a 5-point scale was 4.04, with 80% of responses scoring ≥4. The mean completeness score on a 3-point scale was 2.3, with 95% of responses scoring ≥2. The five raters unanimously rated ChatGPT responses as highly accurate and complete across a range of patient and physician questions surrounding RZV, with one instance of low accuracy and completeness. Although the differences were not significant, ChatGPT demonstrated the highest accuracy and completeness when answering questions related to ACIP guidelines compared with the other information sources. Conclusions ChatGPT exhibits a promising ability to address specific queries regarding RZV for patients with rheumatic and musculoskeletal diseases. However, it is essential to approach ChatGPT with caution due to the risk of misinformation. This study emphasizes the importance of rigorously validating LLMs as a health information source.
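The two summary figures this abstract reports (a mean Likert rating and the share of responses at or above a cutoff) amount to simple arithmetic over the raters' scores. A minimal sketch, with invented example scores rather than the study's data:

```python
# Hypothetical sketch of the summary statistics reported in the RZV study:
# mean Likert rating and the proportion of responses scoring >= a cutoff.
# The scores below are invented for illustration, not the study's data.
def summarize_likert(scores, cutoff):
    """Return (mean score, fraction of scores at or above the cutoff)."""
    mean = sum(scores) / len(scores)
    share = sum(1 for s in scores if s >= cutoff) / len(scores)
    return mean, share

# e.g., five raters' accuracy scores for one response on a 1-5 Likert scale
mean, share = summarize_likert([5, 4, 4, 3, 4], cutoff=4)
```

On this toy input the mean is 4.0 and four of five scores reach the cutoff, mirroring the kind of "mean 4.04, 80% scoring ≥4" summary the study reports.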