Quantitative and Qualitative evaluation of the recent Artificial Intelligence in Healthcare
publications using Deep-Learning
Authors: Raghav Awasthi, MSc1; Shreya Mishra, MTech1; Jacek B Cywinski, MD2; Ashish K
Khanna4; Kamal Maheshwari, MD2; Francis A. Papay, MD3; Piyush Mathur, MD2
Affiliations: 1Indraprastha Institute of Information Technology (IIIT), Delhi, India; 2Department of
General Anesthesiology, Anesthesiology Institute, Cleveland Clinic, Cleveland, OH, USA;
3Dermatology and Plastic Surgery Institute, Cleveland Clinic, Cleveland, OH, USA; 4Wake Forest
University School of Medicine, Winston-Salem, NC, USA.
Corresponding author: Piyush Mathur, MD, FCCM, Anesthesiology Institute, Cleveland Clinic,
E3-205, 9500 Euclid Avenue, Cleveland, Ohio, USA 44195 (pmathurmd@gmail.com)
medRxiv preprint doi: https://doi.org/10.1101/2022.12.31.22284092; this version posted January 4, 2023. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
NOTE: This preprint reports new research that has not been certified by peer review and should not be used to guide clinical practice.
Abstract
Background:
An ever-increasing number of artificial intelligence (AI) models targeting healthcare applications
are developed and published every day, but their use in real-world decision-making is limited.
Beyond a quantitative assessment, it is important to have qualitative evaluation of the maturity of
these publications with additional details related to trends in type of data used, type of models
developed across the healthcare spectrum.
Methods:
We assessed the maturity of selected peer-reviewed AI publications pertinent to healthcare for the
years 2019–2021. Data collection was performed by PubMed search using the Boolean
query "machine learning" OR "artificial intelligence" AND "2021", "2020", or "2019",
restricted to English-language, human-subject research as of December 31 of each year.
Publications from all three years were manually classified into 34 distinct medical specialties. We used the
Bidirectional Encoder Representations from Transformers (BERT) neural networks model to
identify the maturity level of research publications based on their abstracts. We further classified a
mature publication based on the healthcare specialty and geographical location of the article's
senior author. Finally, we manually annotated specific details from mature publications, such as
model type, data type, and disease type.
Results:
Of the 7062 publications relevant to AI in healthcare from 2019–2021, 385 were classified as
mature. The proportion of mature publications was 6.01 percent in 2019, 7.7 percent in 2020,
and 1.81 percent in 2021. Radiology publications had the most mature
model publications across all specialties over the last three years, followed by pathology in 2019,
ophthalmology in 2020, and gastroenterology in 2021. Geographical pattern analysis revealed a
non-uniform distribution pattern. In 2019 and 2020, the United States ranked first with a
frequency of 22 and 50, followed by China with 20 and 47. In 2021, China ranked first with 17
mature articles, followed by the United States with 11 mature articles. Imaging-based data was the
primary source, and deep learning was the most frequently used modeling technique in mature
publications.
Interpretation:
Despite the growing number of publications of AI models in healthcare, only a few publications
have been found to be mature, with a potentially positive impact on healthcare. Globally, there is
an opportunity to leverage diverse datasets and models across the health spectrum to develop
more mature models and related publications, which can fully realize the potential of AI to
transform healthcare.
Research in Context
Evidence Before Study
There is an increasing number of publications related to AI in healthcare across different
specialties, with limited assessment of the maturity of these publications and methodological
analysis of their key characteristics. We performed a PubMed search using combinations of the
keywords "maturity" or "evaluation" AND "AI in healthcare" restricted to the English language
and the past ten years of publications, and found 15 relevant publications. Six were focused on
proposing a qualitative framework for evaluating AI models, including one article proposing an
evaluation framework for prediction models and one article focusing on health economic
evaluations of AI in healthcare models. The remaining publications were related to the usability
of AI models. There are limited studies assessing the maturity of AI in healthcare publications
that provide detailed insights into key compositional factors such as data types, model
types, and geographical trends across different healthcare specialties.
The added value of this Study
With an exponentially increasing number of publications, this is, to our knowledge, the first study
to provide a comprehensive quantitative and qualitative evaluation of recent mature
"AI in Healthcare" publications. This study builds on a semi-automated approach that combines
deep learning with a unique in-house collection of "AI in Healthcare" publications over the recent
three years to highlight the current state of AI in healthcare. The whole spectrum of data types,
model types, geographical trends, and disease types represented in the mature publications is
presented empirically in this research, providing unique insights.
Implications of all the available evidence
This thorough and comparative evaluation of mature publications across different healthcare
specialities provides the evidence which can be used to guide future research and resource
utilization. Results from this study show that the percentage of mature publications in all
healthcare specialties is much lower than in radiology. Text and tabular data are also
underrepresented compared to image data in mature publications. Geographical trends of these
publications also show gaps in inclusivity and the need to provide resources to support AI in
healthcare research globally. Publications using deep learning models have the highest
frequency of mature articles. Our detailed analysis of the mature AI in healthcare publications
demonstrates an opportunity to leverage heterogeneous datasets and models across the health
spectrum to increase the yield of mature AI in healthcare publications.
Introduction
Artificial intelligence in healthcare is defined as the capacity of computers to mimic human
cognition in the comprehension, analysis, and organization of complex medical and healthcare
data1. AI encompasses complex algorithms that learn from data and help in data-driven
decision-making in uncertain situations. The basic objective of health-related AI applications is to
examine associations between clinical procedures and patient outcomes. AI systems are used in
diagnostics, treatment protocol creation, medication discovery, customized medicine, patient
monitoring, care, and drug development2. The excitement to build artificial intelligence-based
applications in healthcare is shared among clinicians, researchers, and industry3,4. Numerous
academic departments and start-ups are building AI models to solve clinical and administrative
problems. Since January 2020, numerous COVID-19-related AI models have helped in risk
stratification, diagnosis, or treatment development and have been proposed for implementation in
clinical care5.
However, few AI models are being used in real-time for decision-making3. It seems imperative
that researchers working in this field can robustly assess the model before deployment. Quality
assessment of the vast and ever-increasing number of AI models in healthcare is lagging6. In
general, the quality of AI models is assessed against predefined criteria such as accuracy, area
under the receiver operating characteristic curve (AUROC), and F1 score. However, even
high-performance AI models have often failed to realize their potential in trials of real-world
clinical adoption7. This has prompted calls for further validation, feasibility, and utility
assessment of these AI models in clinical environments. The language of a published article,
which explains the details of an AI model, is the primary means of qualitatively evaluating the
model, analyzing its robustness, and assessing its maturity. The time-consuming nature of
reading papers and the need to understand both AI and healthcare make it difficult for humans to
judge published work at scale. Evaluation of AI-based publications in healthcare using AI itself
has recently been developed and validated8. This approach determines an answer to a
maturity-level question: "Does the proposed model's output have a direct,
actionable impact on patient care by providing information to healthcare providers or automated
systems?" AI-based maturity models predict the maturity level of an article9. In other words,
maturity models, also known as "capability frameworks," quantitatively assess research articles.
Systematic literature review and bibliometric analysis are commonly employed in all sciences to
gain an in-depth understanding of a particular study subject. Recently, a BERT (Bidirectional
Encoder Representations from Transformers) based language model was developed to assess
the quality of AI models in medical literature8. We have attempted to evaluate peer-reviewed
publications using BERT both quantitatively and qualitatively, using clinician-provided
annotations of selected healthcare publications from 2019, 2020, and 202110,11,12. We aimed to
understand which areas of healthcare have the most mature models and what we can learn from
them to advance AI in other healthcare areas. Through this evaluation framework, we asked
three key questions: 1) What is the maturity of AI in healthcare publications across medical
specialties? 2) What is the geographical distribution of mature AI in healthcare publications?
3) What data types and model types are utilized in mature AI in healthcare publications?
Methods
A rigorous pipeline was employed to analyze research papers in this study [Figure 1]. First, we
utilized the recent three years of AI in healthcare publications10,11,12 from PubMed, which had
been manually classified into 34 distinct medical specialties. We determined the country of
origin of the senior authors using the "location-tagger"13 Python package, which employs the NER
(Named Entity Recognition) NLP task. Location-tagger can detect and extract locations
(countries, regions, states, and cities) from text or URLs and find relationships among countries,
regions, and cities.
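A minimal sketch of this extraction step, assuming a simple dictionary lookup in place of the package's NER pipeline (the affiliation string and country list below are illustrative only, not the study's data):

```python
from typing import Optional

# Simplified stand-in for the NER-based "location-tagger" step:
# scan an author-affiliation string for a known country name.
# A real NER pipeline handles ambiguity (e.g., "India" vs. "Indiana");
# this lookup is for illustration only.
KNOWN_COUNTRIES = ["United States", "China", "India", "USA"]
ALIASES = {"USA": "United States"}  # normalize common variants

def country_of_affiliation(affiliation: str) -> Optional[str]:
    """Return the first known country mentioned in an affiliation string."""
    for name in KNOWN_COUNTRIES:
        if name in affiliation:
            return ALIASES.get(name, name)
    return None

senior_author = "Anesthesiology Institute, Cleveland Clinic, Cleveland, OH, USA"
print(country_of_affiliation(senior_author))  # -> United States
```

In the study itself, the extracted country of the senior (final corresponding) author, not the MEDLINE country of publication, determines the country of study.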
Following that, we used the BERT neural network model8 to identify the maturity level of
research publications based on their abstracts. Finally, we manually annotated specific details
from the mature articles, such as model type, data type, and disease type.
AI in healthcare publication selection and data extraction
In this study, we used in-house data compiled for "Artificial Intelligence in Healthcare" reviews
for 2019–2021. Data collection was performed by PubMed search using the phrases "machine
learning" or "artificial intelligence" and "2021," "2020," and "2019" with the English language
and human subject research as of December 31, each year. This search produced a preliminary list
of 3351, 5885, and 4164 papers in 2019, 2020, and 2021 respectively. The papers were then
individually examined and excluded based on flaws in PubMed search results or lack of relevance to this
study. Our final cohort included 1647, 3232, and 2182 papers chosen, examined, and classified
into one or more medical disciplines in the years 2019, 2020, and 2021 [Table1]. A significant
proportion of the excluded publications focused on robotic surgeries with no relevance to ML/AI,
specific gene research with limited therapeutic significance, non-human investigations, or brief
remarks. Approximately 5% of articles were relevant to two or more specialties and were counted
in each. Most drug discovery-related publications, as well as some review or editorial articles,
were categorized as "General." Using a Python geocoding module, we determined the
geographical location of author affiliations. The location included in MEDLINE metadata refers
to the country of publication and not necessarily the country where the study was undertaken. We
determined the country of study based on the final corresponding author affiliation.
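The PubMed retrieval described above can be approximated programmatically through NCBI's E-utilities; the query string in this sketch is only an approximation of the authors' search (their exact field tags and filters are not specified), built here with standard-library code:

```python
from urllib.parse import urlencode

# NCBI E-utilities esearch endpoint (real URL); the term string below
# approximates the paper's PubMed search: "machine learning" OR
# "artificial intelligence", the year, English language, and human subjects.
BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def build_search_url(year: int) -> str:
    """Build an esearch URL for one year of the approximated query."""
    term = (
        '("machine learning" OR "artificial intelligence") '
        f"AND {year}[dp] AND english[la] AND humans[mh]"
    )
    params = {"db": "pubmed", "term": term, "retmax": 10000}
    return BASE + "?" + urlencode(params)

print(build_search_url(2021))
```

Fetching the resulting URL returns the matching PMIDs, which would then be screened manually as described above.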
Healthcare specialty         2019   2020   2021
Administrative                 76    102     72
Anesthesiology                 14     38     18
Cardiology                     88    188    119
COVID-19                        0    322    134
Critical Care                  32     41     24
Dermatology                    35     45     30
Education                       9     17     24
Emergency Medicine              8     18     10
Endocrinology                  17     42     26
Gastroenterology               42     81    173
General                       343    510    451
Genetics                      114    120     65
Head & Neck                    21     73     51
Nephrology                     16     28     14
Neurology                      70    172     92
Ob/Gyn                         22     38     19
Oncology                      106    219    214
Ophthalmology                  56    132     82
Orthopedics/Rheumatology       20     48     24
Pathology                      77    105     71
Pediatrics                     31     39     25
Rehabilitation Medicine        17     41     14
Psychiatry                     65    101     74
Pulmonary                      19     38     21
Radiology                     400    657    318
Surgery                        47    141     84
Total (selected)             1647   3232   2182
Excluded                     1704   2653   1982
Total (search results)       3351   5885   4164
Table 1. Publications related to artificial intelligence in healthcare [Total (selected) = publications
selected after exclusions from initial PubMed search; Excluded = publications excluded based on
exclusion criteria; Total (search results) = publications returned by PubMed search]12
Maturity Model
We utilized a previously developed approach8 to classify a research paper's maturity based on its
abstract. The title and abstract were used as predictors of the paper's level of maturity. 2500
manually labeled abstracts from 1998 to 2020 were used to fine-tune the hyperparameters of the
BERT PubMed classifier. BERT is a transformer-based deep learning model for NLP tasks.
BERT's functioning depends entirely on attention mechanisms that capture the contextual
relationships between words in a text. The maturity classifier was validated on a test set (n=784)
and prospectively on abstracts from 2021 (n=2494). On the test set, the model had an accuracy of
99 percent and an F1 score of 93 percent; in prospective validation, it had an accuracy of 99
percent and an F1 score of 91 percent. Lastly, the model was benchmarked against curated
publications from a systematic review of AI versus clinicians14. The maturity model uses the joint
abstract and title of an article to forecast the paper's maturity.
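As a toy illustration of this binary maturity framing, a keyword heuristic over the joint title and abstract can stand in for the fine-tuned BERT classifier; the cue phrases below are hypothetical and are not part of the published model:

```python
# Toy, keyword-based stand-in for the BERT maturity classifier.
# The real model is a fine-tuned transformer operating on the joint
# title + abstract; these cue phrases are hypothetical and only
# illustrate the binary "mature vs. not mature" decision.
MATURITY_CUES = (
    "deployed",
    "prospective validation",
    "clinical workflow",
    "real-world",
    "implementation",
)

def is_mature(title: str, abstract: str) -> bool:
    """Heuristic mirroring the classifier's joint title+abstract input."""
    text = f"{title} {abstract}".lower()
    return any(cue in text for cue in MATURITY_CUES)

print(is_mature(
    "AI triage model",
    "We report prospective validation of a deployed model in routine care.",
))  # -> True
```

The BERT classifier replaces these hand-picked cues with learned contextual representations, which is what allows it to generalize across specialties and writing styles.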
Analysis
Using the model described above, we predicted the maturity of publications for the years 2019,
2020, and 2021 and conducted temporal analyses in the following way.
First, we characterized the general pattern of research maturity from 2019 to 2021.
Second, we conducted a pattern analysis by healthcare specialty for 2019, 2020, and 2021.
Next, we examined the pattern of mature AI in healthcare articles through a global lens.
Finally, we manually annotated the data type and model type for mature papers in
2019, 2020, and 2021.
Results
Maturity patterns by the year
Of the 1647 publications published in 2019, 103 (99 mature models, four systematic reviews)
were considered mature. In 2020, 253 of 3232 publications (250 mature models, 3 systematic
reviews) were deemed mature. In 2021, however, of 2182 publications in total, only 83 (36
mature models, 47 systematic reviews) were considered mature [Figure 2 (B)].
Percentage-level estimates indicated non-monotonic patterns in the maturation tendencies of
publications: 6.01 percent of publications were mature in 2019, 7.7 percent in 2020, and 1.81
percent in 2021. Because systematic reviews do not provide concise information regarding the
type of AI models and data used, they were excluded from further analysis.
Maturity patterns by medical specialty
Different medical specialties pose unique challenges. Here, we have separated the
specialty-specific findings for all 34 specialties [Figure 2 (A)]. Radiology has the most mature
models across all specialties over three years, followed by Pathology in 2019, Ophthalmology in
2020, and Gastroenterology in 2021. Our analysis also found that the number of mature papers in
Gastroenterology, Oncology, and Ophthalmology has steadily increased from 2019 to 2021. In
2020 and 2021, the COVID-19 pandemic affected the entire world. Many researchers used
AI-based models to tackle this deadly infection, leading to a significant number of publications.
However, our analysis reveals that only 4 and 1.6 percent of these COVID-19-related publications
were mature in 2020 and 2021, respectively.
Globally, cardiovascular diseases (CVDs) are the major cause of mortality. In 2019, an estimated
17.9 million individuals died from CVDs, accounting for 32% of all deaths worldwide; 85 percent
of these fatalities were the result of heart attacks and strokes. In 2019, 2020, and 2021, there were
88, 188, and 119 artificial intelligence models relevant to the prognosis and prevention of
cardiovascular illnesses. However, we discovered that the ranking of mature publications in
CVDs fell between 2019 and 2021 compared to other specialties.
Mature articles frequency distribution by the geographic location of the senior authors
We retrieved the country of the paper's senior author to investigate the variation of mature papers
at the level of each country [Figure 3]. We discovered a non-uniform distribution pattern. In
2019 and 2020, the United States ranked first with a frequency of 22 and 50, followed by China
with 20 and 47. In 2021, China ranked first with 17 mature articles, followed by the United States
with 11 mature articles. This indicates that mature publications are more frequent in developed
nations than in developing nations. However, our geo-map analysis revealed that developing
nations like India have also published mature AI in healthcare articles, although in India's case
there were only four mature publications across 2019–2021.
Comparison of Various Datasets and AI Models Employed in Mature Articles:
We manually annotated data types and AI models within the mature articles. We have primarily
categorized the data types as Image, Text, and Tabular, and model types as Deep learning (DL),
Classical machine learning (ML), Natural language processing (NLP), Probabilistic models,
Reinforcement learning (RL), and fundamental statistical models. Compared to textual and tabular
data, we discovered that the proportion of mature publications using image data is high [Figure
4A].
In 2019, 89% of mature publications incorporated image data, with similar proportions in 2020
and 2021 (88.23% and 88.66%). From 2019 to 2020, the use of tabular data in mature models
declined from 11% to 3%, and in 2021, no mature articles used tabular data. We also found that text
data in mature publications climbed by 8 percentage points from 2019 to 2021, with 11% of mature publications
using text data in 2021. We further subdivided the use of mature publications that included image
data by medical specialty. Image data were the most used in the specialty of Radiology, followed
by COVID-19 as a specialty disease and Ophthalmology.
Similarly, we saw that the proportion of mature publications using Deep learning models relative
to other AI models was very high [Figure 4B]. We observed that DL was used in 66% of all
mature publications in 2019, 71% in 2020, and 62% in 2021. According to our findings,
traditional machine learning models placed second behind deep learning models, with 19, 34, and
11 mature publications using classical machine learning techniques in the three years examined.
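The data-type and model-type tallies above can be reproduced from the manual annotations with a simple counter; the records in this sketch are hypothetical examples, not the study data:

```python
from collections import Counter

# Tally data types and model types across manually annotated mature
# papers. These example records are hypothetical and only illustrate
# the structure of the annotation; they are not the study's data.
annotations = [
    {"year": 2019, "data": "Image", "model": "DL"},
    {"year": 2019, "data": "Tabular", "model": "ML"},
    {"year": 2020, "data": "Image", "model": "DL"},
    {"year": 2021, "data": "Text", "model": "NLP"},
]

data_counts = Counter(a["data"] for a in annotations)
model_counts = Counter(a["model"] for a in annotations)

print(data_counts.most_common(1))   # -> [('Image', 2)]
print(model_counts.most_common(1))  # -> [('DL', 2)]
```

Grouping the same counter by year (filtering `annotations` on `a["year"]`) yields the per-year distributions plotted in Figure 4.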
Discussion
It is no surprise that across the recent years 2019–2021 we identified thousands of peer-reviewed
publications related to healthcare artificial intelligence (AI). However, only 5% (385/7062) of the
publications were classified as mature, underscoring the urgent need for the development of
clinically relevant and deployable AI models.
Although AI development in healthcare is expanding globally, according to our geographical
pattern analysis, mature publications in AI in healthcare are concentrated in a handful of
countries. The United States continues to lead in the publication of mature models, closely
followed by China in year-over-year comparisons [Figure 3]8. We found some population areas,
such as South America, Eastern Europe, and Africa, to be underrepresented in AI publications,
which is concerning and can lead to the development of biased models and subsequently limit the
generalizability and scalability of developed AI solutions15. For advanced AI research, the
availability of digitized data, healthcare information technology infrastructure, data scientists,
computing capabilities, and funding are critical components, which are evidently concentrated in
developed countries.
To understand which specific healthcare specialties lead AI research, we annotated the data
type and specialty for the mature publications and determined that imaging data was the most
prevalent. Imaging data has been the most utilized data type, probably due to easier access to
open-source data supported by various institutions such as Harvard, MIT, Stanford 16, and the
Radiological Society of North America (RSNA)17. Imaging data used to develop mature models
included various modalities, such as computed tomography (CT scans), magnetic resonance
imaging (MRI), and simple radiographs (X-rays). Early interest in adopting image-based AI for
ophthalmologic disease diagnosis, such as diabetic retinopathy, has also been supported by the
increased availability of fundoscopic images18. Imaging data in some mature models also
included cine loops, particularly in specialties such as Gastroenterology (endoscopy videos) and
Cardiology (echocardiography cine loops)19.
In 2009, ImageNet started the revolution in the general use of image-interpretation AI
solutions, especially Convolutional Neural Networks (CNNs)20. Following similar patterns, in
healthcare, CNNs continue to be the most commonly used AI models, particularly for the
interpretation of imaging data [Figure 4]. The proliferation of research in the automated
classification of lung nodules on chest X-rays or for stroke diagnosis has led to the further
development of mature models in Radiology21. Similarly, the adoption of AI for diagnosing
ophthalmologic diseases such as diabetic retinopathy spurred an increase in research and
development from industry and healthcare entities, which continue to evolve further and mature
22. In specialties such as cardiology and gastroenterology, the use of deep learning in enhancing
echocardiography image acquisition and interpretation or endoscopy has resulted in an increased
number of publications describing mature models23,24. Many of these models, after FDA approval,
have been embedded in medical devices or clinical workflows7. Unlike CNN-based models, large
language models or multimodal models have been developed more recently. Publications using
text data or multimodal data have been steadily increasing, and their maturity is improving25,26.
Readily available CNN algorithms and large imaging data repositories have enabled radiology
and other image-based specialties such as ophthalmology, gastroenterology, oncology, and
cardiology to generate strong growth in mature model publications27 [Figure 2 (A)]. Similar to radiology,
AI applications from these specialties are also being implemented in healthcare. Oncology-based
mature models are primarily based on imaging data and the use of deep learning algorithms28.
COVID-19 presented a unique opportunity for researchers to apply some of the methods from
imaging-based modeling to interpret chest X-rays and CT scans, amongst others29. Although
progress was made in a relatively short time to create mature models and publications, real-world
adoption has been limited, especially now that the pandemic has slowed down30.
While many other search methodologies using publication databases such as Scopus could have
been utilized to evaluate more publications, we decided to use PubMed due to the ready
availability of a validated maturity model built on PubMed and related data. Conference abstracts
and publications not in the English language might have led to some loss of data in our
evaluation. Still, we believe our methodology captures most of the publications and addresses the
purpose of our evaluation. Also, while there are various publication ranking methods, such as the
number of citations, they have limited value over shorter evaluation time frames.
The application of AI in healthcare has caught the imagination of many, leading to an exponential
rise in the number of publications over the past few years. Our evaluation demonstrates the
potential and the opportunity to fully utilize available data and diverse AI models across the
world and the entire healthcare domain.
References:
1. Yu KH, Beam AL, Kohane IS. Artificial intelligence in healthcare. Nat Biomed Eng.
2018;2(10):719-731. doi:10.1038/s41551-018-0305-z
2. Shah P, Kendall F, Khozin S, et al. Artificial intelligence and machine learning in clinical
development: a translational perspective. Npj Digit Med. 2019;2(1):1-5.
doi:10.1038/s41746-019-0148-3
3. Lee D, Yoon SN. Application of Artificial Intelligence-Based Technologies in the Healthcare
Industry: Opportunities and Challenges. Int J Environ Res Public Health. 2021;18(1):271.
doi:10.3390/ijerph18010271
4. Hinton G. Deep Learning—A Technology With the Potential to Transform Health Care. JAMA.
2018;320(11):1101-1102. doi:10.1001/jama.2018.11100
5. Bachtiger P, Peters NS, Walsh SL. Machine learning for COVID-19—asking the right
questions. Lancet Digit Health. 2020;2(8):e391-e392. doi:10.1016/S2589-7500(20)30162-X
6. Black AD, Car J, Pagliari C, et al. The impact of eHealth on the quality and safety of health
care: a systematic overview. PLoS Med. 2011;8(1):e1000387.
doi:10.1371/journal.pmed.1000387
7. Benjamens S, Dhunnoo P, Meskó B. The state of artificial intelligence-based FDA-approved
medical devices and algorithms: an online database. NPJ Digit Med. 2020;3(1):1-8.
doi:10.1038/s41746-020-00324-0
8. Zhang J, Whebell S, Gallifant J, et al. An interactive dashboard to track themes, development
maturity, and global equity in clinical artificial intelligence research. Lancet Digit Health.
2022;4(4):e212-e213. doi:10.1016/S2589-7500(22)00032-2
9. Gomes J, Romão M. Information System Maturity Models in Healthcare. J Med Syst. 2018;42.
doi:10.1007/s10916-018-1097-0
10. Mathur P, Mummadi S, Khanna A, et al. 2019 Year in Review: Machine Learning in Healthcare; 2020. doi:10.13140/RG.2.2.34310.52800
11. Mathur P, Khanna A, Cywinski J, et al. Artificial Intelligence in Healthcare: 2020 Year in Review; 2021. doi:10.13140/RG.2.2.29325.05604
12. Mathur P, Mishra S, Awasthi R, et al. Artificial Intelligence in Healthcare: 2021 Year in Review; 2022. doi:10.13140/RG.2.2.25350.24645/1
13. Soni K. locationtagger: Detect & Extract locations from text or URL and find
relationships among locations. Accessed October 30, 2022.
https://github.com/kaushiksoni10/locationtagger
14. Nagendran M, Chen Y, Lovejoy CA, et al. Artificial intelligence versus clinicians:
systematic review of design, reporting standards, and claims of deep learning studies. BMJ.
2020;368:m689. doi:10.1136/bmj.m689
15. Ibrahim H, Liu X, Zariffa N, Morris AD, Denniston AK. Health data poverty: an
assailable barrier to equitable digital health care. Lancet Digit Health. 2021;3(4):e260-e265.
doi:10.1016/S2589-7500(20)30317-4
16. Johnson AEW, Pollard TJ, Berkowitz SJ, et al. MIMIC-CXR, a de-identified publicly
available database of chest radiographs with free-text reports. Sci Data. 2019;6(1):317.
doi:10.1038/s41597-019-0322-0
17. Publicly Accessible Data Needed to Develop AI Algorithms. Accessed October 22, 2022.
https://www.rsna.org/news/2021/february/accessible-data-for-ai-algorithms
18. Khan SM, Liu X, Nath S, et al. A global review of publicly available datasets for
ophthalmological imaging: barriers to access, usability, and generalisability. Lancet Digit
Health. 2021;3(1):e51-e66. doi:10.1016/S2589-7500(20)30240-5
19. Hirasawa T, Ikenoyama Y, Ishioka M, et al. Current status and future perspective of
artificial intelligence applications in endoscopic diagnosis and management of gastric cancer.
Dig Endosc Off J Jpn Gastroenterol Endosc Soc. 2021;33(2):263-272. doi:10.1111/den.13890
20. Deng J, Dong W, Socher R, Li LJ, Li K, Fei-Fei L. ImageNet: A large-scale hierarchical
image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition. ;
2009:248-255. doi:10.1109/CVPR.2009.5206848
21. Hwang EJ, Park S, Jin KN, et al. Development and Validation of a Deep Learning–Based
Automated Detection Algorithm for Major Thoracic Diseases on Chest Radiographs. JAMA
Netw Open. 2019;2(3):e191095. doi:10.1001/jamanetworkopen.2019.1095
22. Gulshan V, Peng L, Coram M, et al. Development and Validation of a Deep Learning
Algorithm for Detection of Diabetic Retinopathy in Retinal Fundus Photographs. JAMA.
2016;316(22):2402-2410. doi:10.1001/jama.2016.17216
23. Spadaccini M, Iannone A, Maselli R, et al. Computer-aided detection versus advanced
imaging for detection of colorectal neoplasia: a systematic review and network meta-analysis.
Lancet Gastroenterol Hepatol. 2021;6(10):793-802. doi:10.1016/S2468-1253(21)00215-6
24. Ouyang D, He B, Ghorbani A, et al. Video-based AI for beat-to-beat assessment of cardiac
function. Nature. 2020;580(7802):252-256. doi:10.1038/s41586-020-2145-8
25. Chang D, Lin E, Brandt C, Taylor RA. Incorporating Domain Knowledge Into Language
Models by Using Graph Convolutional Networks for Assessing Semantic Textual Similarity:
Model Development and Performance Comparison. JMIR Med Inform. 2021;9(11):e23101.
doi:10.2196/23101
26. Tiu E, Talius E, Patel P, Langlotz CP, Ng AY, Rajpurkar P. Expert-level detection of
pathologies from unannotated chest X-ray images via self-supervised learning. Nat Biomed
Eng. Published online September 15, 2022:1-8. doi:10.1038/s41551-022-00936-9
27. Biousse V, Newman NJ, Najjar RP, et al. Optic Disc Classification by Deep Learning
versus Expert Neuro-Ophthalmologists. Ann Neurol. 2020;88(4):785-795.
doi:10.1002/ana.25839
28. Yu TF, He W, Gan CG, et al. Deep learning applied to two-dimensional color Doppler
flow imaging ultrasound images significantly improves diagnostic performance in the
classification of breast masses: a multicenter study. Chin Med J (Engl). 2021;134(4):415-424.
doi:10.1097/CM9.0000000000001329
29. Ardakani AA, Kanafi AR, Acharya UR, Khadem N, Mohammadi A. Application of deep
learning technique to manage COVID-19 in routine clinical practice using CT images: Results
of 10 convolutional neural networks. Comput Biol Med. 2020;121:103795.
doi:10.1016/j.compbiomed.2020.103795
30. Roberts M, Driggs D, Thorpe M, et al. Common pitfalls and recommendations for using
machine learning to detect and prognosticate for COVID-19 using chest radiographs and CT
scans. Nat Mach Intell. 2021;3(3):199-217. doi:10.1038/s42256-021-00307-0
Figure 1: Methodology pipeline: First, we retrieved the most recent three years of AI-in-healthcare papers from PubMed and manually categorized them into 34 medical specialties. We identified the country of each article's senior author. Then, we used the BERT neural network model to determine the maturity of research articles based on their abstracts. Finally, we manually annotated the mature publications with specific details, including model and data types.
Figure 2: A) Year-wise pattern of mature publications by healthcare specialty: A normalized heat map (where 0 represents the lowest and 1 the highest number of mature publications) ranks medical specialties by mature research publications for 2019–2021. After classifying all of the selected PubMed articles into one of 34 distinct medical specialties, we identified the overall pattern; radiology ranked first in all three years. B) Overall maturity patterns by year: Bar graphs compare the numbers of mature and immature publications published in 2019, 2020, and 2021. We determined that 6.01 percent of publications were mature in 2019, 7.7 percent in 2020, and 1.81 percent in 2021.
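The 0–1 scaling used for the heat map in panel A is a standard min-max normalization. A minimal sketch, with hypothetical per-specialty counts rather than the study's actual figures:

```python
# Min-max normalization of mature-publication counts to [0, 1], as used for
# the Figure 2A heat map: the lowest count maps to 0 and the highest to 1.
def min_max_normalize(counts: dict) -> dict:
    lo, hi = min(counts.values()), max(counts.values())
    span = (hi - lo) or 1  # avoid division by zero when all counts are equal
    return {k: (v - lo) / span for k, v in counts.items()}

# Hypothetical placeholder counts, not the study's actual data.
counts_2019 = {"Radiology": 25, "Pathology": 9, "Ophthalmology": 4}
normalized = min_max_normalize(counts_2019)
# The top specialty maps to 1.0 and the bottom one to 0.0.
```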
Figure 3: Year-wise geographical pattern of mature publications: The geo-map presents the country-wise frequency distribution of mature articles over three years, revealing a non-uniform pattern. In 2019 and 2020, the most mature publications came from the USA; in 2021, China had the most.
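Using the country frequencies reported in the Results for 2019 and 2020 (United States 22 and 50, China 20 and 47), the per-year ranking behind the geo-map reduces to a sort by count; other countries and the 2021 figures are omitted here:

```python
# Mature-publication counts per country, from the reported 2019 and 2020
# frequencies (counts for other countries and for 2021 are not shown).
counts = {
    2019: {"United States": 22, "China": 20},
    2020: {"United States": 50, "China": 47},
}

# Rank countries within each year by frequency, highest first.
ranking = {
    year: sorted(by_country, key=by_country.get, reverse=True)
    for year, by_country in counts.items()
}
# In both years the United States ranks first, followed by China.
```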
Figure 4: Analysis of data type, model type, and disease: A) We subcategorized the data type into three primary categories: image, tabular, and text. The heat map illustrates the frequency of mature articles in these three categories; image data was the most prevalent in each of the three years. B) Model type was subcategorized into frequently used types: deep learning, machine learning, natural language processing, reinforcement learning, and statistical modeling. Deep learning models accounted for the greatest proportion of mature papers across all three years. C) Mature articles that used images were plotted by their frequency of appearance in each medical specialty; radiology ranked highest among all specialties.
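The data-type frequencies in panel A amount to a tally over the manually annotated labels. A minimal sketch with hypothetical annotations standing in for the study's actual labels:

```python
from collections import Counter

# Hypothetical per-article data-type annotations; the real labels come from
# the manual annotation of mature publications described in Methods.
annotations = ["Image", "Image", "Tabular", "Image", "Text", "Image", "Tabular"]

freq = Counter(annotations)
proportions = {k: v / len(annotations) for k, v in freq.items()}
# Image is the most frequent category in this sample, mirroring panel A.
```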