PLOS

PLOS Digital Health

Published by PLOS

Online ISSN: 2767-3170


Top-read articles

2,343 reads in the past 30 days

Proposed mediation model to explain the association between Internet Addiction, Emotional Intelligence and Mental Health
Proposed mediation model to explain the association between dimensions of Internet Addiction, Emotional Intelligence and Mental Health
Normality distribution of: (Fig 3A) Emotional intelligence on Mental Health, (Fig 3B) Internet Addiction on Mental Health, (Fig 3C) Internet Addiction on Emotional Intelligence, and (Fig 3D) Internet Addiction on Emotional Intelligence and mental Health.
Confirmatory Factor Analysis model of Internet Addiction Scale
Confirmatory Factor Analysis model of Emotional Intelligence Scale


Investigating the mediating role of emotional intelligence in the relationship between internet addiction and mental health among university students

November 2024

·

6,806 Reads

·

2 Citations

·

·

Seid Dawed

·

[...]


Aims and scope


PLOS Digital Health is a journal for democratizing healthcare in the interdisciplinary and digital age. We encourage excellent research from healthcare professionals, policy makers, and other stakeholders across digital health that embraces open code and data sharing to improve health outcomes for patients. We empower researchers to ensure that progress and technology in this rapidly-changing field is discoverable, accessible, and reproducible.

Recent articles


Illustration of the trade-off between performance-driven result and group fairness-driven result in the context of organ allocation, where gender is the sensitive attribute
of biases across each stage of the machine learning (ML) pipeline
AI-driven healthcare: Fairness in AI healthcare: A survey
  • Literature Review

May 2025

·

6 Reads

Artificial intelligence (AI) is rapidly advancing in healthcare, enhancing the efficiency and effectiveness of services across various specialties, including cardiology, ophthalmology, dermatology, emergency medicine, etc. AI applications have significantly improved diagnostic accuracy, treatment personalization, and patient outcome predictions by leveraging technologies such as machine learning, neural networks, and natural language processing. However, these advancements also introduce substantial ethical and fairness challenges, particularly related to biases in data and algorithms. These biases can lead to disparities in healthcare delivery, affecting diagnostic accuracy and treatment outcomes across different demographic groups. This review paper examines the integration of AI in healthcare, highlighting critical challenges related to bias and exploring strategies for mitigation. We emphasize the necessity of diverse datasets, fairness-aware algorithms, and regulatory frameworks to ensure equitable healthcare delivery. The paper concludes with recommendations for future research, advocating for interdisciplinary approaches, transparency in AI decision-making, and the development of innovative and inclusive AI applications.


Designing a computer-assisted diagnosis system for cardiomegaly detection and radiology report generation

Chest X-rays (CXRs) are a diagnostic tool for cardiothoracic assessment and make up 50% of all diagnostic imaging tests. With hundreds of images examined every day, radiologists can suffer from fatigue. This fatigue may reduce diagnostic accuracy and slow down report generation. We describe a prototype computer-assisted diagnosis (CAD) pipeline employing computer vision (CV) and natural language processing (NLP). It was trained and evaluated on the publicly available MIMIC-CXR dataset. We perform image quality assessment, view labelling, and segmentation-based cardiomegaly severity classification. We use the output of the severity classification for large language model-based report generation. Four board-certified radiologists assessed the output accuracy of our CAD pipeline. Across the dataset composed of 377,100 CXR images and 227,827 free-text radiology reports, our system identified 0.18% of cases with mixed-sex mentions, 0.02% of poor quality images (F1 = 0.81), and 0.28% of wrongly labelled views (accuracy 99.4%). We assigned views for the 4.18% of images that had unlabelled views. Our binary cardiomegaly classification model has 95.2% accuracy. The inter-radiologist agreement on evaluating the generated report’s semantics and correctness for radiologist-MIMIC is 0.62 (strict agreement) and 0.85 (relaxed agreement), similar to the radiologist-CAD agreement of 0.55 (strict) and 0.93 (relaxed). Our work found and corrected several incorrect or missing metadata annotations for the MIMIC-CXR dataset. These results suggest that our CAD system performs on par with human radiologists. Future improvements revolve around improved text generation and the development of CV tools for other diseases.


A - Cardiopulmonary exercise testing (CPET)
On the left, the different phases of CPET are detailed, including Rest, Ventilatory Anaerobic Threshold (AT), Peak, and Recovery, describing the exercise intensity levels and physiological responses. The central graph displays the progression of exercise intensity, marked by the green ramp, alongside traces of oxygen uptake (V̇O2) and carbon dioxide production (V̇CO2). The right section elucidates primary features derived from CPET such as V̇O2, V̇CO2, heart rate (HR), oxygen pulse (V̇O2/HR), ventilation (VE), and the respiratory exchange ratio (RER). The image portrays a patient undergoing CPET, equipped with the appropriate testing apparatus. The graph is presented with the consent of Quick O, Reed-Poysden C, from their work “Cardiopulmonary Exercise Test: Interpretation and Application in Perioperative Medicine” 2022, https://doi.org/10.28923/atotw.473. B - Bar chart illustrating the average number of minor and major components contributing to each Postoperative Morbidity Survey (POMS) score on day 3. The bars represent the mean count of individual minor (blue) and major (orange) morbidity factors for patients with POMS scores ranging from 0 to 6. C - Scatter plot with linear regression analysis illustrating the relationship between patient age and Postoperative Morbidity Survey (POMS) scores on day 3. Each datapoint represents an individual patient’s age against their corresponding POMS score. D - Scatter plot with linear regression analysing the relationship between V̇O2/Kg at the Anaerobic Threshold (AT) and Postoperative Morbidity Survey (POMS) scores on day 3. Each datapoint signifies the V̇O2/Kg AT value for an individual patient and their respective POMS score.
A - Receiver operating characteristic (ROC) curve comparison for the Multi-objective Symbolic Regression (MOSR) model’s performance in predicting Postoperative Morbidity Survey (POMS) scores at day 3
The ROC curves of three models are presented: MOSR using Cardiorespiratory Fitness (CRF) dataset (blue), MOSR using Clinical dataset (orange), and MOSR utilizing a combination of both CRF and Clinical dataset (green). B - SHAP Beeswarm Plot of MOSR models using Cardiorespiratory Fitness (CRF) and Clinical database. Each row represents a feature used in the model. Each dot on a row corresponds to a datapoint, with its position on the x-axis indicating the SHAP value, or the contribution of that feature to the model’s prediction for that datapoint. The colour of the dots represents the value of that feature, with blue indicating low values and red indicating high values. V̇O2/Kg AT: Oxygen Consumption per Kilogram at Anaerobic Threshold, V̇O2/Kg VOP: Oxygen Consumption per Kilogram at Peak, VE/V̇CO2 AT: Ventilatory Equivalent for Carbon Dioxide at Anaerobic Threshold, BMI: Body Mass Index, V̇O2/Kg Rest = Oxygen Consumption per Kilogram at Rest, MET = Metabolic Equivalent of Task, V̇O2 AT = Oxygen Consumption at Anaerobic Threshold, HR AT = Heart Rate at Anaerobic Threshold, VT VOP = Tidal Volume at Peak, VE VOP = Ventilation at Peak, V̇O2/HR AT = Oxygen Consumption per Heart Rate at Anaerobic Threshold, V̇CO2 AT = Carbon Dioxide Production at Anaerobic Threshold, PetCO2 VOP = Partial End-tidal Carbon Dioxide at Peak, WR AT = Work Rate at Anaerobic Threshold, VE AT = Ventilation at Anaerobic Threshold, WR VOP = Work Rate at Peak, PetCO2 Rest = Partial End-tidal Carbon Dioxide at Rest, RR VOP = Respiratory Rate at Peak. C - Receiver Operating Characteristic (ROC) curves comparing the Multi-objective Symbolic Regression (MOSR) model, utilizing either CRF data alone or with Clinical data, against established clinical assessment scores in predicting Postoperative Morbidity Survey (POMS) outcomes on day 3.
MOSR CRF dataset model (blue), Cardiopulmonary Exercise Testing (CPET) (red), American Society of Anaesthesiologists Score (ASA) (yellow), Physiological and Operative Severity Score for the enumeration of Mortality and morbidity (PPOSSUM) (purple), and Duke Activity Status Index (DASI) (green). D–F - Receiver Operating Characteristic (ROC) curves comparing the Multi-objective Symbolic Regression (MOSR) model, against DecisionTree Classifier, LGBM Classifier, Logistic Regression, Random Forest Classifier, and XGB Classifier. Panel D shows ROC curves for models using only Clinical dataset, Panel E displays ROC curves for models using only Cardiorespiratory Fitness dataset and Panel F presents ROC curves for models using both CRF and Clinical data.
A – Comparison of 585 time-series showing oxygen uptake per kilogram per minute (V̇O2/Kg/min) in cardiopulmonary exercise testing (CPET)
The x-axis shows the moment of the exam, while the y-axis displays the V̇O2/Kg/min values measured in ml/kg/min. Patients are categorized and color-coded based on their Postoperative Morbidity Survey (POMS) scores: those with POMS scores equal to or less than 1 are represented in light blue, POMS scores of 2 or more in orange, and POMS scores greater than 3 in red. B – Illustration of the effect of resampling a data track on its resolution and detail. The top left plot shows the original track with high-resolution data points, depicted in blue. The other three plots demonstrate the track after resampling to different numbers of points, with each plot color-coded to represent a specific resampling: 1000 points in green (top right), 500 points in orange (bottom left), and 100 points in red (bottom right). C - Receiver Operating Characteristic (ROC) curves comparing the Multi-objective Symbolic Regression (MOSR) model against other machine learning classifiers: Decision Tree, LGBM Classifier, Logistic Regression, Random Forest Classifier, and XGB Classifier. Each curve represents the respective model’s performance in predicting day 3 Postoperative Morbidity Survey (POMS) scores using Cardiorespiratory Fitness Time Series (CRF-TS) dataset. D - Receiver Operating Characteristic (ROC) curves comparing the Multi-objective Symbolic Regression (MOSR) model against other machine learning classifiers: Decision Tree, LGBM Classifier, Logistic Regression, Random Forest Classifier, and XGB Classifier. Each curve represents the respective model’s performance in predicting day 3 Postoperative Morbidity Survey (POMS) scores using the subset of patients used in the Time-Series experiment from the Cardiorespiratory Fitness (CRF 585) dataset.
Population demographics, comorbidities, laboratory variables, medications and surgery types, cardiopulmonary exercise test (CPET) values and perioperative outcomes
Machine learning model performances
Assessing perioperative risks in a mixed elderly surgical population using machine learning: A multi-objective symbolic regression approach to cardiorespiratory fitness derived from cardiopulmonary exercise testing

May 2025

·

39 Reads

Accurate preoperative risk assessment is of great value to both patients and clinical teams. Several risk scores have been developed but are often not calibrated to the local institution, limited in terms of data input into the underlying models, and/or lack individual precision. Machine Learning (ML) models have the potential to address limitations in existing scoring systems. A database of 1190 elderly patients who underwent major elective surgery was analyzed retrospectively. Preoperative cardiorespiratory fitness data from cardiopulmonary exercise testing (CPET), demographic and clinical data were extracted and integrated into advanced ML algorithms. Multi-Objective-Symbolic-Regression (MOSR), a novel algorithm utilizing Genetic Programming to generate mathematical formulae for learning tasks, was employed to predict patient morbidity at Postoperative Day 3, as defined by the PostOperative Morbidity Survey (POMS). Shapley-Additive-exPlanations (SHAP) was subsequently used to analyze feature contributions. Model performance was benchmarked against existing risk prediction scores, namely the Portsmouth-Physiological-and-Operative-Severity-Score-for-the-Enumeration-of-Mortality-and-Morbidity (PPOSSUM) and the Duke-Activity-Status-Index, as well as linear regression using CPET features. A model was also developed for the same task using data directly extracted from the CPET time-series. The incorporation of cardiorespiratory fitness data enhanced the performance of all models for predicting postoperative morbidity by 20% compared to sole reliance on clinical data. Cardiorespiratory fitness features demonstrated greater importance than clinical features in the SHAP analysis. Models utilizing data taken directly from the CPET time-series demonstrated a 12% improvement over the cardiorespiratory fitness models. The MOSR model surpassed all other models in every experiment, demonstrating excellent robustness and generalization capabilities.
Integrating cardiorespiratory fitness data with ML models enables improved preoperative prediction of postoperative morbidity in elective surgical patients. The MOSR model stands out for its capacity to pinpoint essential features and build models that are both simple and accurate, showing excellent generalizability.


A non-specialist worker delivered digital assessment of cognitive development (DEEP) in young children: A longitudinal validation study in rural India

May 2025

·

16 Reads

Cognitive development in early childhood is critical for life-long well-being. Existing cognitive development surveillance tools require lengthy parental interviews and observations of children. Developmental Assessment on an E-Platform (DEEP) is a digital tool designed to address this gap by providing a gamified, direct assessment of cognition in young children which can be delivered by front-line providers in community settings. This longitudinal study recruited children from the SPRING trial in rural Haryana, India. DEEP was administered at 39 (SD 1; N = 1359), 60 (SD 5; N = 1234) and 95 (SD 4; N = 600) months and scores were derived using item response theory. Criterion validity was examined by correlating DEEP-score with age, Bayley’s Scales of Infant Development (BSID-III) cognitive domain score at age 3 and Raven’s Coloured Progressive Matrices (CPM) at age 8; predictive validity was examined by correlating DEEP-scores at preschool-age with academic performance at age 8; and convergent validity was examined through correlations with height-for-age z-scores (HAZ), socioeconomic status (SES) and early life adversities. DEEP-score correlated strongly with age (r = 0.83, 95% CI 0.82 – 0.84) and moderately with BSID-III (r = 0.50, 0.39 – 0.60) and CPM (r = 0.37; 0.30 – 0.44). DEEP-score at preschool-age predicted academic outcomes at school-age (0.32; 0.25 – 0.41) and correlated positively with HAZ and SES and negatively with early life adversities. DEEP provides a valid, scalable method for cognitive assessment. Its integration into developmental surveillance programs could aid in monitoring and early detection of cognitive delays, enabling timely interventions.
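The abstract above reports correlations with 95% confidence intervals but does not say how the intervals were computed. As a rough illustration only, one common approach is Fisher's z-transformation, sketched below. The function name `pearson_ci` and the sample size in the usage line are hypothetical and are not taken from the study.

```python
import math


def pearson_ci(r: float, n: int, z_crit: float = 1.959964) -> tuple[float, float]:
    """Approximate confidence interval for a Pearson correlation.

    Uses Fisher's z-transformation; z_crit = 1.959964 gives a 95% interval.
    This is an illustrative method, not necessarily the one the authors used.
    """
    if n <= 3:
        raise ValueError("n must be greater than 3")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    z = math.atanh(r)                # transform r to the z scale
    se = 1.0 / math.sqrt(n - 3)      # standard error on the z scale
    lower = math.tanh(z - z_crit * se)
    upper = math.tanh(z + z_crit * se)
    return lower, upper


if __name__ == "__main__":
    # Hypothetical sample size, chosen only to show the call.
    low, high = pearson_ci(0.50, 200)
    print(f"r = 0.50, n = 200: 95% CI {low:.2f} to {high:.2f}")
```

The interval is asymmetric around r on the correlation scale and narrows as n grows, which is why very large samples yield tight intervals.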


Utilizing process mining in quality management: A case study in radiation oncology

Radiation oncology is known for its complexity, inherent risks, and sheer volume of data. Adopting a process-oriented management approach and systemic thinking is essential for ensuring safety, efficiency, and the highest quality of care. Process mining offers a data-centric method for analyzing and improving clinical workflows to ensure optimal patient outcomes. This study utilizes process mining techniques along with a quality management system to analyze event logs obtained from an electronic medical record system. Conformance checking and process improvement methodologies were utilized to detect inefficiencies and bottlenecks. Examining the treatment planning process through process mining revealed two principal bottlenecks—OAR contouring and physics chart checks. This led to specific interventions that markedly decreased the time to complete treatment planning processes. Additionally, applying organizational mining methods provided valuable information on how resources are utilized and how teams collaborate within the organization. Process mining is a useful tool for improving efficiency, quality, and decision-making in radiation oncology. By transitioning from traditional management to a data-driven leadership approach, radiation oncology departments can optimize workflows, enhance patient care, and adapt to the evolving demands of modern healthcare.


Development and validation of an AI algorithm to generate realistic and meaningful counterfactuals for retinal imaging based on diffusion models

May 2025

·

1 Citation

Counterfactual reasoning is often used by humans in clinical settings. For imaging based specialties such as ophthalmology, it would be beneficial to have an AI model that can create counterfactual images, illustrating answers to questions like “If the subject had had diabetic retinopathy, how would the fundus image have looked?”. Such an AI model could aid in training of clinicians or in patient education through visuals that answer counterfactual queries. We used large-scale retinal image datasets containing color fundus photography (CFP) and optical coherence tomography (OCT) images to train ordinary and adversarially robust classifiers that classify healthy and disease categories. In addition, we trained an unconditional diffusion model to generate diverse retinal images including ones with disease lesions. During sampling, we then combined the diffusion model with classifier guidance to achieve realistic and meaningful counterfactual images maintaining the subject’s retinal image structure. We found that our method generated counterfactuals by introducing or removing the necessary disease-related features. We conducted an expert study to validate that generated counterfactuals are realistic and clinically meaningful. Generated color fundus images were indistinguishable from real images and were shown to contain clinically meaningful lesions. Generated OCT images appeared realistic, but could be identified by experts with higher than chance probability. This shows that combining diffusion models with classifier guidance can achieve realistic and meaningful counterfactuals even for high-resolution medical images such as CFP images. Such images could be used for patient education or training of medical professionals.


From manual clinical criteria to machine learning algorithms: Comparing outcome endpoints derived from diverse electronic health record data modalities

May 2025

·

12 Reads

Background: Progression-free survival (PFS) is a critical clinical outcome endpoint during cancer management and treatment evaluation. Yet, PFS is often missing from publicly available datasets due to the current subjective, expert-dependent, and time-intensive nature of generating PFS metrics. Given emerging research in multi-modal machine learning (ML), we explored the benefits and challenges associated with mining different electronic health record (EHR) data modalities and automating extraction of PFS metrics via ML algorithms. Methods: We analyzed EHR data from 92 pathology-proven GBM patients, obtaining 233 corticosteroid prescriptions, 2080 radiology reports, and 743 brain MRI scans. Three methods were developed to derive clinical PFS: 1) frequency analysis of corticosteroid prescriptions, 2) natural language processing (NLP) of reports, and 3) computer vision (CV) volumetric analysis of imaging. Outputs from these methods were compared to manually annotated clinical guideline PFS metrics. Results: Employing data-driven methods, standalone progression rates were 63% (prescription), 78% (NLP), and 54% (CV), compared to the 99% progression rate from manually applied clinical guidelines using integrated data sources. The prescription method identified progression an average of 5.2 months later than the clinical standard, while the CV and NLP algorithms identified progression earlier by 2.6 and 6.9 months, respectively. While lesion growth is a clinical guideline progression indicator, only half of patients exhibited increasing contrast-enhancing tumor volumes during scan-based CV analysis. Conclusion: Our results indicate that data-driven algorithms can extract tumor progression outcomes from existing EHR data. However, ML methods are subject to varying availability bias, supporting contextual information, and pre-processing resource burdens that influence the extracted PFS endpoint distributions.
Our scan-based CV results also suggest that the automation of clinical criteria may not align with human intuition. Our findings indicate a need for improved data source integration, validation, and revisiting of clinical criteria in parallel to multi-modal ML algorithm development.


An inherently interpretable AI model improves screening speed and accuracy for early diabetic retinopathy

May 2025

·

10 Reads

Diabetic retinopathy (DR) is a frequent complication of diabetes, affecting millions worldwide. Screening for this disease based on fundus images has been one of the first successful use cases for modern artificial intelligence in medicine. However, current state-of-the-art systems typically use black-box models to make referral decisions, requiring post-hoc methods for AI-human interaction and clinical decision support. We developed and evaluated an inherently interpretable deep learning model, which explicitly models the local evidence of DR as part of its network architecture, for clinical decision support in early DR screening. We trained the network on 34,350 high-quality fundus images from a publicly available dataset and validated its performance on a large range of ten external datasets. The inherently interpretable model was compared to post-hoc explainability techniques applied to a standard DNN architecture. For comparison, we obtained detailed lesion annotations from ophthalmologists on 65 images to study if the class evidence maps highlight clinically relevant information. We tested the clinical usefulness of our model in a retrospective reader study, where we compared screening for DR without AI support to screening with AI support with and without AI explanations. The inherently interpretable deep learning model obtained an accuracy of .906 [.900–.913] (95%-confidence interval) and an AUC of .904 [.894–.913] on the internal test set and similar performance on external datasets, comparable to the standard DNN. High evidence regions directly extracted from the model contained clinically relevant lesions such as microaneurysms or hemorrhages with a high precision of .960 [.941–.976], surpassing post-hoc techniques applied to a standard DNN. Decision support by the model highlighting high-evidence regions in the image improved screening accuracy for difficult decisions and improved screening speed. 
This shows that inherently interpretable deep learning models can provide clinical decision support while obtaining state-of-the-art performance, improving human-AI collaboration.


Evaluation of the virtual care experience for persons in prospective cohorts with HIV during the COVID pandemic

May 2025

·

12 Reads

The COVID pandemic necessitated shifting to virtual care. Our aim was to describe the virtual care experience, including its challenges and participant satisfaction, in a subset of participants from two established Canadian Trials Network (CTN) cohorts: CTN 222 (HIV/HCV coinfection) and CTN 314: CHANGE HIV (Correlates of Healthy Aging in geriatric HIV infection; persons > 65 years of age). We hypothesized that vulnerable populations could face challenges with virtual care related to age, mental health or drug addiction. Consenting participants provided demographic information, completed a non-validated 18-item self-administered questionnaire on their virtual care experience, and reported HIV-specific laboratory collection and prescription refills during the COVID pandemic. Data on CD4 T lymphocyte counts and HIV viral loads were extracted from medical records. A total of 454 individuals participated between February 2021 and March 2023, including 133 from CTN 314 and 321 from CTN 222. Overall, 55.3% engaged in virtual care. In multivariable regression models (analysis with SAS and R software), use of virtual care was higher in the aging cohort (p < .0001) but did not vary with current alcohol use, drug use or self-reported depression (p > .05). The most common reason for not engaging was that virtual care was not offered. Of those who engaged, 55% reported being very satisfied, 36.3% somewhat satisfied, and 8.8% not satisfied. Ten percent of the older cohort and 16% of the HCV cohort reported technology difficulties as a barrier to use. Those with a detectable HIV viral load were more likely to engage in virtual care (p < .05). Overall, 81.3% of participants had HIV blood tests as frequently as before the COVID-19 pandemic. Despite high satisfaction, the majority (80%) preferred in-person visits. When offering virtual care, clinics need to ensure all eligible patients are aware of how to access the services and consider patient needs and preferences.


Can scientific literature search be aided by conversational AIs
Can the search for scientific literature be improved by user-support tools? Illustrations in panels a, b, and c were generated using Adobe Illustrator’s generative AI features
Relevant peer-reviewed publications retrieved: P1 [34], P2 [35], P3 [36], P4 [37], P5 [38], P6 [39], P7 [40], P8 [41], P9 [42], P10 [43], P11 [44].
Literature search results with augmented ChatGPT on high-interest topic with abundance of information (COVID-19)
Relevant peer-reviewed publications retrieved: P1 [34], P2 [35], P3 [36], P4 [37], P5 [38], P6 [39], P7 [40], P8 [41].
Can augmented ChatGPT assist researchers in hypothesis generation
Artificial intelligence’s contribution to biomedical literature search: revolutionizing or complicating?

There is a growing number of articles about conversational AI (e.g., ChatGPT) for generating scientific literature reviews and summaries. Yet, comparative evidence lags behind its wide adoption by many clinicians and researchers. We explored ChatGPT’s utility for literature search from an end-user perspective through the lens of clinicians and biomedical researchers. We quantitatively compared basic versions of ChatGPT’s utility against conventional search methods such as Google and PubMed. We further tested whether ChatGPT user-support tools (i.e., plugins, web-browsing function, prompt-engineering, and custom-GPTs) could improve its response across four common and practical literature search scenarios: (1) high-interest topics with an abundance of information, (2) niche topics with limited information, (3) scientific hypothesis generation, and (4) newly emerging clinical practice questions. Our results demonstrated that basic ChatGPT functions had limitations in consistency, accuracy, and relevancy. User-support tools showed improvements, but the limitations persisted. Interestingly, each literature search scenario posed different challenges: an abundance of secondary information sources in high-interest topics, and a lack of compelling literature for new/niche topics. This study tested practical examples highlighting both the potential and the pitfalls of integrating conversational AI into literature search processes, and underscores the necessity for rigorous comparative assessments of AI tools in scientific research.


MyChart activation methods* by month from September 12, 2023, to September 11, 2024
Monthly total MyChart logins by mobile app and web browser
Comparison of MyChart-active patients’ demographics to the patients who had visited our hospital at least once while MyChart was available
Uptake and user characteristics of MyChart within a Canadian community hospital with a diverse patient population: A comparative study

May 2025

·

10 Reads

Patient portals offer a convenient way to access health information and increase patient participation in healthcare. To promote broad accessibility and impact of portals, it is essential to understand uptake patterns across patient populations. This study described the characteristics of patient users of a portal called MyChart and compared them to non-users at a large community hospital. We descriptively analyzed (frequency, counts) patient health records to characterize MyChart users and their usage patterns during the first year of its launch from September 11, 2023, to September 11, 2024. We summarized user demographics along with information about how they activated accounts, accessed MyChart, and utilized its features. Using chi-square and t-tests, we compared MyChart user demographics to non-users who visited the hospital in the same time period. A total of 61,306 patients activated MyChart during the first year it was available. On average, MyChart users were 53 years old, 62% female, 64% predicted to have White ethnicity, and preferred to receive healthcare in English (88%). MyChart users tended to be regular healthcare users, with an average of five annual visits prior to creating an account, and logged onto the portal five times a month on average. MyChart users were slightly younger than non-users (an average age of 53.5 vs. 56.9 years) and visited the hospital more often (an average of 5.7 vs. 3.1 annual visits). Many patients activated MyChart during the first year of launch, and users closely resembled the broader patient population. To enhance adoption and potential benefits of patient portals, targeted interventions such as accessible educational information tailored to diverse patient groups (e.g., older adults, different ethnicities) could increase their usage.


Practitioners’ perspectives on implementation of acute virtual wards: A scoping review

May 2025

·

11 Reads

Virtual wards provide a promising alternative to traditional ‘bedded care’ by facilitating early discharges and delivering acute care at home. They focus specifically on patients needing acute care, which would traditionally necessitate an in-hospital stay. Understanding practitioners’ beliefs and attitudes is crucial for successful implementation and operation of virtual wards. This scoping review explores practitioners’ perspectives on the implementation of virtual wards. A total of 18 studies were included in the final analysis from the 201 studies identified initially through searches in PubMed, Cochrane, CINAHL, and Embase databases (2015–2024) following PRISMA Extension for Scoping Reviews (PRISMA-ScR) guidelines. Thematic analysis, conducted using Braun and Clarke’s framework, revealed key themes related to implementation, quality of care, technology, training, and awareness. These themes highlight the challenges influencing the adoption and considerations for the operational success of virtual wards. Virtual wards demonstrate significant potential for delivering acute care efficiently and sustainably. However, challenges related to service design, patient safety, technology integration, and workforce training must be addressed to ensure their successful implementation and long-term efficacy.


Wearables research for continuous monitoring of patient outcomes: A scoping review

May 2025

·

43 Reads

Background The use of wearable devices for remote health monitoring is a rapidly expanding field. These devices might benefit patients and providers; however, they are not yet widely used in healthcare. This scoping review assesses the current state of the literature on wearable devices for remote health monitoring in non-hospital settings. Methods CINAHL, Scopus, Embase and MEDLINE were searched until August 5, 2024. We performed citation searching and searched Google Scholar. Studies on wearable devices in an outpatient setting with a clinically relevant, measurable outcome were included and were categorized according to intended use of data: monitoring of existing disease vs. diagnosis of new disease. Results Eighty studies met eligibility criteria. Most studies used device data to monitor a chronic disease (68/80, 85%), most often neurodegenerative (22/68, 32%). Twelve studies (12/80, 15%) used device data to diagnose new disease, the majority being cardiovascular (9/12, 75%). A range of wearable devices were studied, with watches and bracelets being most common (50/80, 63%). Only six studies (8%) were randomized controlled trials, four of which (67%) showed evidence of positive clinical impact. Feasibility determinants were inconsistently reported, including compliance (51/80, 64%), patient-reported usability (13/80, 16%), and participant technology literacy (1/80, 1%). Conclusions Evidence for clinical effectiveness of wearable devices remains scant. Heterogeneity across studies in terms of devices, disease targets and monitoring protocols makes data synthesis challenging, especially given the rapid pace of technical innovation. These findings provide direction for future research and implementation of wearable devices in healthcare.
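
The proportions quoted in the abstract above can be recomputed from the stated counts. The following is a minimal sketch in Python, not part of the original study; the helper names `percent_half_up` and `mismatches` are hypothetical, and it assumes the authors rounded halves upward (so that 50/80 = 62.5% is reported as 63%).

```python
def percent_half_up(numerator: int, denominator: int) -> int:
    """Return 100 * numerator / denominator rounded to the nearest integer,
    with halves rounded up. Integer arithmetic avoids floating-point edge cases."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return (200 * numerator + denominator) // (2 * denominator)


# (label, count, total, percentage as reported in the abstract)
REPORTED = [
    ("monitoring a chronic disease", 68, 80, 85),
    ("neurodegenerative, among monitoring studies", 22, 68, 32),
    ("diagnosing new disease", 12, 80, 15),
    ("cardiovascular, among diagnostic studies", 9, 12, 75),
    ("watches and bracelets", 50, 80, 63),
    ("randomized controlled trials", 6, 80, 8),
    ("trials showing positive clinical impact", 4, 6, 67),
    ("compliance reported", 51, 80, 64),
    ("patient-reported usability", 13, 80, 16),
    ("participant technology literacy", 1, 80, 1),
]


def mismatches(rows):
    """List the labels whose recomputed percentage differs from the reported one."""
    return [label for label, n, total, pct in rows
            if percent_half_up(n, total) != pct]


if __name__ == "__main__":
    for label, n, total, pct in REPORTED:
        print(f"{label}: {n}/{total} -> {percent_half_up(n, total)}% (reported {pct}%)")
    print("mismatches:", mismatches(REPORTED))
```

Under the half-up rounding assumption, every reported percentage agrees with its count.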


Clinical insights: A comprehensive review of language models in medicine

May 2025

·

19 Reads

·

1 Citation

This paper explores the advancements and applications of language models in healthcare, focusing on their clinical use cases. It examines the evolution from early encoder-based systems requiring extensive fine-tuning to state-of-the-art large language and multimodal models capable of integrating text and visual data through in-context learning. The analysis emphasizes locally deployable models, which enhance data privacy and operational autonomy, and their applications in tasks such as text generation, classification, information extraction, and conversational systems. The paper also highlights a structured organization of tasks and a tiered ethical approach, providing a valuable resource for researchers and practitioners, while discussing key challenges related to ethics, evaluation, and implementation.


Feasibility of HABIT-ILE@home in children with cerebral palsy and adults with chronic stroke: A pilot study

May 2025

·

52 Reads

Introduction Children with cerebral palsy (CP) and adults with chronic stroke (CS) usually have disabilities in voluntary motor control. Hand-Arm Bimanual Intensive Therapy Including Lower Extremities (HABIT-ILE), an evidence-based therapy, has always been provided during day camps. This pilot study investigates if HABIT-ILE@home, a remote neurorehabilitation, is feasible for children with CP and adults with CS. Methods Four children with CP (5-18y) and three adults with CS were recruited. They received 15h (5x3h) of HABIT-ILE@home provided by a caregiver with a remote supervision of 30min at the beginning and end of each session. A large touch screen, the REAtouch Lite, was used as a support for the therapy. An interview based on a questionnaire (n = 73 items for CP/ n = 74 items for stroke patients; scored from 0 “disagree” to 3 “agree”, a higher rating meaning a more positive aspect of the therapy) was conducted with patients and their caregivers after 15h of supervised home-therapy to assess their adherence to the treatment and the feasibility of HABIT-ILE@home. Performance and satisfaction in achieving functional goals were assessed before and after the intervention using the Canadian Occupational Performance Measure (COPM). Results Caregivers felt sufficiently supported by the supervision team (medians = 3) to carry out HABIT-ILE@home sessions thanks to an adequate clinical supervision (CP median = 2.6; CS median = 2.9). HABIT-ILE principles were transferable at patients’ home (CP median = 2.6; CS median = 2.8). The impact of the therapy on daily organization was more problematic for children’s caregivers (median = 1.5) than for adults’ caregivers (median = 3). Children with CP enjoyed the therapy (median = 2) but felt that it was too long (median = 1) and significant fatigue was present (median = 1.3). CS adults did not find the therapy fun (median = 1) but considered it as extremely useful (median = 3). 
Although the motivational source differed between children and adults, this did not seem to strongly affect adherence to treatment. Performance and satisfaction in achieving functional goals improved over the MCID (2 points) for all CS participants and for 3 out of 4 CP children. Conclusion HABIT-ILE@home seems to be feasible for children with CP and adults with CS. It may allow more patients to benefit from an efficient neurorehabilitation, regardless of sanitary conditions or where patients live.



Phases of the Innovation Funnel for Valuable AI in Healthcare [21] (IFVAIH). Abbreviations: AIPA: Artificial Intelligence Prediction Algorithm.
Participants’ characteristics
Overview of themes, subcategories and mapping onto the IFVAIH Funnel
Exploring the complex nature of implementation of Artificial intelligence in clinical practice: an interview study with healthcare professionals, researchers and Policy and Governance Experts

May 2025

·

16 Reads

Artificial Intelligence (AI)-based tools have shown potential to optimize clinical workflows, enhance patient quality and safety, and facilitate personalized treatment. However, transitioning viable AI solutions to clinical implementation remains limited. To understand the challenges of bringing AI into clinical practice, we explored the experiences of healthcare professionals, researchers, and Policy and Governance Experts in hospitals. We conducted a qualitative study with thirteen semi-structured interviews (mean duration 52.1 ± 5.4 minutes) with healthcare professionals, researchers, and Policy and Governance Experts with prior experience in AI development in hospitals. The interview guide was based on value, application, technology, governance, and ethics from the Innovation Funnel for Valuable AI in Healthcare, and the discussions were analyzed through thematic analysis. Six themes emerged: (1) demand-pull vs. tech-push: AI development focusing on innovative technologies may face limited success in large-scale clinical implementation. (2) Focus on generating knowledge, not solutions: Current AI initiatives often generate knowledge without a clear path for implementing AI models once proof-of-concept is achieved. (3) Lack of multidisciplinary collaboration: Successful AI initiatives require diverse stakeholder involvement, often hindered by late involvement and challenging communication. (4) Lack of appropriate skills: Stakeholders, including IT departments and healthcare professionals, often lack the required skills and knowledge for effective AI integration in clinical workflows. (5) The role of the hospital: Hospitals need a clear vision for integrating AI, including meeting preconditions in infrastructure and expertise. (6) Evolving laws and regulations: New regulations can hinder AI development due to unclear implications but also enforce standardization, emphasizing quality and safety in healthcare. 
In conclusion, this study highlights the complexity of AI implementation in clinical settings. Multidisciplinary collaboration is essential and requires facilitation. Balancing divergent perspectives is crucial for successful AI implementation. Hospitals need to assess their readiness for AI, develop clear strategies, standardize development processes, and foster better collaboration among stakeholders.


AI-driven personalized nutrition: RAG-based digital health solution for obesity and type 2 diabetes

May 2025

·

30 Reads

Effective management of obesity and type 2 diabetes is a major global public health challenge that requires evidence-based, scalable personalized nutrition solutions. Here, we present an artificial intelligence (AI) driven dietary recommendation system that generates personalized smoothie recipes while prioritizing health outcomes and environmental sustainability. A key feature of the system is the “virtual nutritionist”, an iterative validation framework that dynamically refines recipes to meet predefined nutritional and sustainability criteria. The system integrates dietary guidelines from the National Institute for Public Health and the Environment (RIVM), EUFIC, USDA FoodData Central, and the American Diabetes Association with retrieval-augmented generation (RAG) to deliver evidence-based recommendations. By aligning with the United Nations Sustainable Development Goals (SDGs), the system promotes plant-based, seasonal, and locally sourced ingredients to reduce environmental impact. We leverage explainable AI (XAI) to enhance user engagement through clear explanations of ingredient benefits and interactive features, improving comprehension across varying health literacy levels. Using zero-shot and few-shot learning techniques, the system adapts to user inputs while maintaining privacy through local deployment of the LLaMA3 model. In evaluating 1,000 recipes, the system achieved 80.1% adherence to health guidelines meeting targets for calories, fiber, and fats and 92% compliance with sustainability criteria, emphasizing seasonal and locally sourced ingredients. A prototype web application enables real-time, personalized recommendations, bridging the gap between AI-driven insights and clinical dietary management. 
This research underscores the potential of AI-driven precision nutrition to revolutionize chronic disease management by improving dietary adherence, enhancing health literacy, and offering a scalable, adaptable solution for clinical workflows, telehealth platforms, and public health initiatives, with the potential to significantly alleviate the global healthcare burden.


Details of the NIRUDAK derivation and validation studies. Both were prospective cohort studies recruiting adults and children 5 years and older at the International Centre for Diarrhoeal Disease Research, Bangladesh’s (icddr,b) Dhaka Hospital
Candidate predictors
Baseline sociodemographic and clinical data
Performance of four modeling methods using three discrimination indices, including average dichotomous c-index (ADC), ordinal c-index (ORC), and generalized c-index (GC), applied to the original training dataset, three different bootstrap methods of internal validation, and to the testing dataset. 95% confidence intervals computed from 1000 bootstrap samples
Comparing the predictive discrimination of machine learning models for ordinal outcomes: A case study of dehydration prediction in patients with acute diarrhea

Many comparisons of statistical regression and machine learning algorithms to build clinical predictive models use inadequate methods to build regression models and do not have proper independent test sets on which to externally validate the models. Proper comparisons for models of ordinal categorical outcomes do not exist. We set out to compare model discrimination for four regression and machine learning methods in a case study predicting the ordinal outcome of severe, some, or no dehydration among patients with acute diarrhea presenting to a large medical center in Bangladesh using data from the NIRUDAK study derivation and validation cohorts. Proportional Odds Logistic Regression (POLR), penalized ordinal regression (RIDGE), classification trees (CART), and random forest (RF) models were built to predict dehydration severity and compared using three ordinal discrimination indices: ordinal c-index (ORC), generalized c-index (GC), and average dichotomous c-index (ADC). Performance was evaluated on models developed on the training data, on the same models applied to an external test set and through internal validation with three bootstrap algorithms to correct for overoptimism. RF had superior discrimination on the original training data set, but its performance was more similar to the other three methods after internal validation using the bootstrap. Performance for all models was lower on the prospective test dataset, with particularly large reduction for RF and RIDGE. POLR had the best performance in the test dataset and was also most efficient, with the smallest final model size. Clinical prediction models for ordinal outcomes, just like those for binary and continuous outcomes, need to be prospectively validated on external test sets if possible because internal validation may give a too optimistic picture of model performance. 
Regression methods can perform as well as more automated machine learning methods if constructed with attention to potential nonlinear associations. Because regression models are often more interpretable clinically, their use should be encouraged.
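
To make the idea of discrimination for an ordinal outcome concrete, here is a minimal Python sketch of a pairwise concordance index. It is an illustration only, not the authors' code: the function name `pairwise_c_index`, the 0/1/2 coding of no/some/severe dehydration, and the toy data are assumptions, and the paper's exact definitions of ORC, GC, and ADC may differ from this simplified form. The sketch counts, among all patient pairs with different observed categories, how often the patient with the more severe outcome also received the higher predicted score, with tied scores counted as half.

```python
from itertools import combinations


def pairwise_c_index(outcomes, scores):
    """Concordance over all pairs of patients whose ordinal outcomes differ.

    outcomes: ordinal categories coded as integers (assumed here: 0 = no,
              1 = some, 2 = severe dehydration).
    scores:   model-predicted severity scores, higher meaning more severe.
    """
    if len(outcomes) != len(scores):
        raise ValueError("outcomes and scores must have the same length")
    concordant = 0.0
    comparable = 0
    for (y_i, s_i), (y_j, s_j) in combinations(zip(outcomes, scores), 2):
        if y_i == y_j:
            continue  # pairs in the same category carry no ordering information
        comparable += 1
        if s_i == s_j:
            concordant += 0.5  # tied predictions count as half-concordant
        elif (s_i > s_j) == (y_i > y_j):
            concordant += 1.0
    if comparable == 0:
        raise ValueError("need at least two distinct outcome categories")
    return concordant / comparable


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    observed = [0, 0, 1, 1, 2, 2]
    predicted = [0.1, 0.4, 0.35, 0.8, 0.7, 0.9]
    print(round(pairwise_c_index(observed, predicted), 3))
```

A value of 0.5 corresponds to chance-level ordering and 1.0 to perfect ordering. Computing the index on the training data and again on an independent test set is what exposes the kind of optimism the abstract describes.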


Digital citizen science for ethical monitoring of youth physical activity frequency: Comparing mobile ecological prospective assessments and retrospective recall

May 2025

·

12 Reads

Physical inactivity is a leading risk factor for mortality worldwide. Understanding youth patterns of moderate-to-vigorous physical activity (MVPA) is essential for addressing non-communicable diseases. Digital citizen science approaches, using citizen-owned smartphones for data collection, offer an ethical and innovative method for monitoring MVPA. This study compares the frequency of MVPA reported by youth using retrospective surveys and mobile ecological prospective momentary assessments (mEPAs) to explore the potential of digital citizen science for physical activity (PA) surveillance. Youth (N = 808) were recruited from Saskatchewan, Canada, between August and December 2018. Sixty-eight participants (ages 13–21) provided complete data on retrospective surveys (International Physical Activity Questionnaire, Simple Physical Activity Questionnaire, Global Physical Activity Questionnaire) and prospective mEPAs. Wilcoxon signed-rank tests compared retrospective and prospective MVPA frequencies, while negative binomial regression analysis examined associations between contextual factors and MVPA. Significant differences were found in the frequency of MVPA reported via retrospective surveys versus mEPAs (p < 0.001). Prospective MVPA was associated with family and friend support, having drug-free friends, part-time employment, and school distance, while retrospective MVPA frequency was associated with school and strength training. Digital citizen science, utilizing mEPAs, can provide more accurate and timely data on youth MVPA. With increasing smartphone access and digital literacy, mEPAs represent a promising method for developing effective and personalized MVPA recommendations for youth. However, these findings should be interpreted with caution, as the sample represents a small subset of youth, limiting generalizability to other youth populations.


Selection of participants for the focus groups with smokers
Selection of participants for the focus groups with smoking cessation professionals
Overview of hierarchically structured themes
No fill indicates that themes were coded inductively, yellow fill indicates that themes were coded deductively from the TDF, and blue fill indicates that themes were coded deductively from the TAM2.
Demographic, smoking, and professional characteristics of participants across focus groups
Exploring perspectives on digital smoking cessation just-in-time adaptive interventions: A focus group study with adult smokers and smoking cessation professionals

May 2025

·

10 Reads

Technology-mediated just-in-time adaptive interventions (JITAIs), which provide users with real-time, tailored behavioural support, are a promising innovation for smoking cessation. However, a greater understanding of stakeholder, including user, perspectives on JITAIs is needed. Focus groups with UK-based adult smokers (three groups; N = 19) and smoking cessation professionals (one group; N = 5) were conducted January-June 2024. Topic guides addressed the integration of a JITAI into users’ lives, preferred content and features, and data and privacy. Transcripts were analysed using inductive and deductive Framework Analysis; deductive codes were derived from the Theoretical Domains Framework and the Technology Acceptance Model. Four co-equal major themes, “Smoking Cessation Process”, “JITAI Characteristics”, “Perceived Value of the JITAI”, and “Relationship with the JITAI”, and 16 subordinate themes were identified. The smoking cessation process was described as a challenging and idiosyncratic, non-linear journey during which a JITAI should provide consistent support. Preferences for specific JITAI characteristics varied. However, participants consistently expressed that a JITAI should be highly personalised and offer both immediate, interruptive support and ambient, in-depth content. The perceived usefulness and ease of use of a JITAI were described as central to its perceived value. Participants stressed that a JITAI would need to be convenient enough to easily integrate into users’ daily lives, yet disruptive enough to facilitate behaviour change. Smokers expressed that they would want their relationship with a JITAI to feel supportive and non-judgmental. They also felt a JITAI should promote their autonomy. Smoking cessation professionals stressed the importance of privacy and data protection, whereas smokers appeared more ambivalent and had mixed opinions about this topic. 
JITAIs need to balance aspects of competing demands in their design, such as optimising for both convenience and sufficient disruption, promoting autonomy, and integrating interruptive and ambient content while also meeting stakeholder needs and expectations in terms of privacy.



Key factors influencing educational technology adoption in higher education: A systematic review

April 2025

·

78 Reads

In the current globalised educational environment, higher education increasingly relied on educational technology to enhance teaching and learning outcomes. Therefore, exploring the decisive factors influencing the adoption of educational technology was crucial for its successful implementation. This paper employed a systematic review using the PRISMA method to investigate four key dimensions affecting educational technology adoption: Performance Expectancy, Effort Expectancy, Social Influence, and Facilitating Conditions. Through a thorough examination of relevant literature, this review deepened the understanding of the core factors influencing the adoption process. A total of 1,891 studies related to educational technology adoption, published between 2015 and 2024, were initially identified, with 39 studies remaining after careful selection for analysis. The classification analysis revealed that all articles were categorised under the four themes: Performance Expectancy (8 articles), Effort Expectancy (18 articles), Social Influence (5 articles), and Facilitating Conditions (8 articles). This review provided valuable insights for higher education institutions aiming to enhance educational quality through the adoption of advanced educational technologies, and it also made a significant contribution to the existing academic literature. However, the interactions between the four dimensions warrant further exploration.



PRISMA Diagram for the systematic scoping review, search terms ‘(augmented reality OR mixed reality) AND surgery AND (consent OR patient education)’. No additional records were added after reference review of included studies
of papers included in the final systematic review
Cochrane risk of bias analysis for the intervention studies included in the review
Uses of augmented reality in surgical consent and patient education – A systematic review

April 2025

·

10 Reads

Augmented reality (AR) allows the real environment to be altered with superimposed graphics using a head-mounted display (HMD), smartphone or tablet. AR in surgery is being explored as a potential disruptive technology and could be used to improve patient understanding of treatment and as an adjunct for surgery. The aim was to explore this use of AR and assess potential benefits for consent and patient education. A systematic review was conducted using PRISMA-ScR guidelines. Four major bibliographic databases were searched using the terms: ‘(augmented reality OR mixed reality) AND surgery AND (consent OR patient education)’. Included papers evaluated an AR intervention on consenting patients for enhancing surgical consent or education about a procedure. Non-English language papers and studies which did not evaluate an intervention were excluded. Three reviewers screened all abstracts and full text papers for inclusion. The review protocol was prospectively registered with PROSPERO (ID: CRD42020207360). 52 records were identified. Following removal of 13 duplicates, 21 were removed after abstract screening leaving 17 articles for full assessment. One article was a letter and 8 did not evaluate interventions, leaving 8 articles published between 2019 and 2023. 3 papers were randomised controlled trials comparing AR enhanced processes to standard consent, 2 cohort studies evaluated patient satisfaction with AR interventions and there was one randomised crossover trial of AR against traditional consent consultation. The Cochrane risk of bias tool was used; most studies were deemed to be at high risk of bias. Patient satisfaction and understanding were improved using AR. However, advantages over other enhanced techniques are less clear. Using AR to enhance written literature was shown to require less mental effort from patients and was preferred to standard resources to understand complex surgery. 
The few randomised trials are limited by bias and lack of power calculation, highlighting the need for further research.


Journal metrics


$2,926

Article processing charge

Editors