Article · Literature Review

Artificial intelligence in veterinary diagnostic imaging: Perspectives and limitations

... As pets increasingly assume roles analogous to human family members, owners expect compassionate care supported by state-of-the-art diagnostics. Yet, current veterinary oncology often lags behind human oncology in adopting emerging technologies [2,7]. ...
... Artificial Intelligence (AI) is transforming veterinary oncology, especially in diagnostic imaging, by enhancing tumor detection and characterization in companion animals and addressing challenges related to accessibility and early diagnosis [7,21]. Conventional imaging techniques, such as Magnetic Resonance Imaging (MRI), Computed Tomography (CT), and ultrasound, have long been essential tools for cancer diagnosis in veterinary medicine [22]. ...
... A recent study highlighted a deep learning algorithm's ability to classify meningeal-based and intra-axial lesions in canine brain MRI scans with over 90% accuracy, a task often considered complex for radiologists [27]. Another study found that AI-assisted MRI improved the precision of tumor volume measurements compared to manual methods, while AI-driven ultrasound reduced false-negative rates for liver tumors in dogs, showcasing its potential to mitigate human diagnostic errors and enhance clinical decision-making [7]. ...
Article
Full-text available
Cancer is a leading cause of death among companion animals, with many cases diagnosed at advanced stages when clinical signs have appeared, and prognosis is poor. Emerging diagnostic technologies, including Artificial Intelligence (AI)-enhanced imaging, liquid biopsies, molecular diagnostics, and nematode-based screening, can improve early detection capabilities in veterinary medicine. These tools offer non-invasive or minimally invasive methods to facilitate earlier detection and treatment planning, addressing the limitations of traditional diagnostics, such as radiography and tissue biopsies. Recent advancements in comparative oncology, which leverage the biological similarities between human and companion animal cancers, underscore their translational value in improving outcomes across species. Technological advances in genomics, bioinformatics, and machine learning are driving a shift toward precision medicine, enabling earlier detection, personalized treatments, and monitoring of disease progression. Liquid biopsy testing detects circulating tumor DNA and tumor cells, providing actionable insights into tumor genetics without invasive procedures. Imaging systems enhance diagnostic precision, offering consistent and accurate tumor identification across veterinary practices, while portable innovations like Caenorhabditis elegans-based screening provide accessible options for underserved regions. As these technologies migrate from human medicine to veterinary applications, they are poised to redefine cancer care for companion animals. This review highlights key advancements in diagnostic technologies and their application in veterinary oncology, with a focus on enhancing early detection, accessibility, and precision in cancer care. By fostering the adoption of these innovations, veterinary oncology can achieve a new standard of care, improving outcomes for both animals and humans through the lens of comparative oncology.
... Despite promising advancements, several challenges hinder the widespread adoption of AI in veterinary diagnostics. Data Quality and Availability: a significant limitation is the scarcity of high-quality, annotated datasets for training AI models; research by Burti et al. (2024) [13] identified this issue as a critical barrier to developing robust diagnostic systems. Interpretability: there is also a pressing need for AI algorithms to be interpretable. ...
Research Proposal
Full-text available
The integration of artificial intelligence (AI) into veterinary diagnostic imaging holds significant promise for enhancing both the accuracy and efficiency of disease detection in animals. By leveraging advanced computational algorithms, AI can assist veterinarians in interpreting complex imaging data, ultimately leading to improved patient care and outcomes.
... This diverse collection of information improves accuracy and time-to-diagnosis for diseases with low prevalence or a low index of suspicion. [22][23][24][25] Artificial intelligence-enhanced diagnostic tools are also working toward improving sample analysis in clinics and in laboratories. For example, in-clinic cellular analyzers 26 and rapid tests for antimicrobial resistance 27,28 could detect abnormalities and resistant pathogens in near real time, enabling early diagnoses and timely, targeted treatments. ...
Article
Full-text available
The field of veterinary medicine, like many others, is expected to undergo a significant transformation due to artificial intelligence (AI), although the full extent remains unclear. Artificial intelligence is already becoming prominent throughout daily life (eg, recommending movies, completing text messages, predicting traffic), yet many people do not realize they interact with it regularly. Despite its prevalence, opinions on AI in veterinary medicine range from skepticism to optimism to indifference. However, we are living through a key moment that calls for a balanced perspective, as the way we choose to address AI now will shape the future of the field. Future generations may view us as either overly optimistic, blinded by AI's allure, or overly pessimistic, failing to recognize its potential. By understanding how algorithms function and predictions are made, we can begin to demystify AI, seeing it not as an all-knowing entity but as a powerful tool that will assist veterinary professionals in providing high-level care and progressing in the field. Building awareness allows us to appreciate its strengths and limitations and recognize the ethical dilemmas that may arise. This review aims to provide an accessible overview of the status of AI in veterinary medicine. This review is not intended to be an exhaustive account of AI.
... The use of AI methods has revolutionized many different aspects of health sciences, especially by enhancing our capabilities to extract quantitative information from digital images that can then be used to predict the presence of lesions in digitized glass slides [42], radiographs [43], or ultrasound images [24]. To date, few CVS have been developed to monitor slaughter lesions in pigs [26-28, 44, 45]. ...
Article
Full-text available
Cranioventral pulmonary consolidation (CVPC) is a common lesion observed in the lungs of slaughtered pigs, often associated with Mycoplasma ( M. ) hyopneumoniae infection. There is a need to implement simple, fast, and valid CVPC scoring methods. Therefore, this study aimed to compare CVPC scores provided by a computer vision system (CVS; AI DIAGNOS) from lung images obtained at slaughter, with scores assigned by human evaluators. In addition, intra- and inter-evaluator variability were assessed and compared to intra-CVS variability. A total of 1050 dorsal view images of swine lungs were analyzed. Total lung lesion score, lesion score per lung lobe, and percentage of affected lung area were employed as outcomes for the evaluation. The CVS showed moderate accuracy (62–71%) in discriminating between non-lesioned and lesioned lung lobes in all but the diaphragmatic lobes. A low multiclass classification accuracy at the lung lobe level (24–36%) was observed. A moderate to high inter-evaluator variability was noticed depending on the lung lobe, as shown by the intraclass correlation coefficient (ICC: 0.29–0.6). The intra-evaluator variability was low and similar among the different outcomes and lung lobes, although the observed ICC slightly differed among evaluators. In contrast, the CVS scoring was identical per lobe per image. The results of this study suggest that the CVS AI DIAGNOS could be used as an alternative to the manual scoring of CVPC during slaughter inspections due to its accuracy in binary classification and its perfect consistency in the scoring.
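The intra- and inter-evaluator agreement discussed above is typically quantified with an intraclass correlation coefficient. A minimal NumPy sketch of a one-way random-effects ICC(1,1) follows; the study does not state which ICC model it used, so this variant is an assumption for illustration:

```python
import numpy as np

def icc1(scores: np.ndarray) -> float:
    """One-way random-effects ICC(1,1).
    scores: (n_subjects, k_raters) matrix of lesion scores."""
    n, k = scores.shape
    grand = scores.mean()
    subj_means = scores.mean(axis=1)
    # Between-subject and within-subject mean squares
    msb = k * ((subj_means - grand) ** 2).sum() / (n - 1)
    msw = ((scores - subj_means[:, None]) ** 2).sum() / (n * (k - 1))
    return float((msb - msw) / (msb + (k - 1) * msw))

# Three hypothetical evaluators scoring the same five lungs
scores = np.array([[1, 1, 2],
                   [4, 4, 4],
                   [0, 0, 0],
                   [3, 2, 3],
                   [5, 5, 5]], dtype=float)
print(round(icc1(scores), 2))
```

Values near 1 indicate high agreement; identical ratings across evaluators give exactly 1.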
... Patterns of animal behaviour, animal movement, and habitat are closely monitored by AI applications that serve conservation planning (Rast et al., 2020). Additionally, some AI applications have merits (Burti et al., 2024). The availability and quality of reliable data are identified as the primary challenge, since data are the backbone of a productive machine-learning effort. The available data in veterinary medicine, being smaller and fragmented, limit its exploitation to the fullest potential. ...
... Li and Zhang [8] introduced a Regressive Vision Transformer for dog cardiomegaly assessment, effectively capturing contextual information in radiographic images. Additionally, Burti et al. [2] explored the application of artificial intelligence in veterinary diagnostic imaging, discussing both the prospects and limitations of such technologies. ...
... It is computer vision that is identified as the most disruptive area of AI application in veterinary medicine 47. Importantly, Polish computer scientists (traditionally from Kraków 48) are conducting world-class research (though in collaboration with veterinarians ...
Preprint
Full-text available
Abstract: Farms in Poland and the region have great potential in the current time of change by implementing artificial-intelligence-based systems for livestock management, particularly for dairy cattle and pigs. The democratization of artificial intelligence through inexpensive devices (such as CCTV and smartphones) offers the potential for a cost-effective revolution in veterinary medicine. However, challenges remain in scaling AI-based precision livestock farming solutions and in balancing productivity with animal welfare. Despite several successful case studies in the region, we lag considerably behind Western Europe and the USA in innovation in this area. I therefore emphasize the importance of integrating AI education into the curricula of animal scientists and veterinarians. Introduction: In modern times, in which technology keeps improving, the topic of artificial intelligence (AI) is beginning to cover almost every aspect of every field of science and research 1. AI represents the intelligence of machines, as opposed to the intelligence of humans or other living forms. Artificial intelligence can be regarded as the operation of "intelligent proxies" 2, i.e., it can take the form of any algorithm or device that can understand and assess its surroundings and begin to act accordingly in order to optimize and fully realize its goals and tasks. The form of intelligence in which algorithms can learn 3 and implement new technology/knowledge is referred to as machine learning (ML). Given the growing evolution of AI/ML, it is natural that scientists and analysts are beginning to explore the possibilities of such technology in all fields of science and industry 4. Smart animal husbandry is no exception to this wave of hardware and software modernization, investing in increasing the requirements for the management of food and
... Recently, the integration of artificial intelligence (AI) and machine learning (ML) with human medical imaging has helped overcome some of the limitations of traditional methodologies in fracture detection and classification [65,66,67], as well as in the segmentation procedures required for developing FE models. Manual segmentation, which is both time-consuming and prone to error, along with the subjective nature of diagnostic methods, are among the limitations of current methods [68]. Additionally, SCB biomechanics is a complex research area which requires analysing the interactions between mechanical loads and the responses of heterogeneous and nonlinear biological tissues, especially in fatigue injury and stress fracture research. ...
Article
Full-text available
Purpose of Review This review synthesizes recent advancements in understanding subchondral bone (SCB) biomechanics using computed tomography (CT) and micro-computed tomography (micro-CT) imaging in large animal models, particularly horses. Recent Findings Recent studies highlight the complexity of SCB biomechanics, revealing variability in density, microstructure, and biomechanical properties across the depth of SCB from the joint surface, as well as at different joint locations. Early SCB abnormalities have been identified as predictive markers for both osteoarthritis (OA) and stress fractures. The development of standing CT systems has improved the practicality and accuracy of live animal imaging, aiding early diagnosis of SCB pathologies. Summary While imaging advancements have enhanced our understanding of SCB, further research is required to elucidate the underlying mechanisms of joint disease and articular surface failure. Combining imaging with mechanical testing, computational modelling, and artificial intelligence (AI) promises earlier detection and better management of joint disease. Future research should refine these modalities and integrate them into clinical practice to enhance joint health outcomes in veterinary and human medicine.
... The integration of artificial intelligence (AI), particularly convolutional neural networks (CNNs), into diagnostic imaging has marked a significant advancement. Silvia Burti has explored various applications of AI across different imaging modalities, such as radiology, ultrasound, CT, and MRI, demonstrating AI's potential to enhance diagnostic accuracy [2]. In veterinary medicine, the use of CNNs, such as ResNet50V2, has shown promising results in diagnosing conditions like cardiomegaly and pulmonary patterns in dogs, providing more consistent and objective measurements compared to traditional manual methods [1]. ...
... This gap highlights a critical area of potential research and development in veterinary diagnostic techniques. Silvia Burti has explored the various applications of artificial intelligence (AI) across different imaging modalities, including radiology, ultrasound, computed tomography (CT), and magnetic resonance imaging (MRI), highlighting AI's potential to enhance diagnostic accuracy [2]. The integration of convolutional neural networks into image classification marks a transformative phase in both general and medical imaging fields. ...
Article
This study introduces a specialized convolutional neural network (CNN), developed using the PyTorch framework, for classifying canine cardiac images from the Dog Heart dataset. Designed to surpass conventional models such as VGG16, our CNN achieved a classification accuracy of 73.5%. This result underscores the model's capacity to identify intricate patterns within the images, marking an advance in veterinary radiology by enhancing diagnostic accuracy for canine conditions. The demonstrated efficacy of this CNN in medical image analysis highlights the potential of customized neural network configurations to optimize diagnostic processes. Future research will aim to refine this model further to boost predictive accuracy and to improve and streamline clinical practices in veterinary medicine.
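Accuracies such as the 73.5% reported above come from comparing predicted labels against ground truth on a held-out test set. A minimal sketch of computing overall and per-class accuracy from two label arrays; the label names in the example are illustrative, not taken from the Dog Heart dataset:

```python
import numpy as np

def accuracy_report(y_true, y_pred, n_classes):
    """Overall accuracy plus a per-class breakdown from label arrays."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    overall = float((y_true == y_pred).mean())
    per_class = {}
    for c in range(n_classes):
        mask = y_true == c
        # Fraction of class-c images that were predicted as class c
        per_class[c] = float((y_pred[mask] == c).mean()) if mask.any() else float("nan")
    return overall, per_class

# Hypothetical labels: 0 = normal heart, 1 = cardiomegaly
y_true = [0, 0, 1, 1, 1, 0, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0, 1, 0]
overall, per_class = accuracy_report(y_true, y_pred, 2)
print(overall)  # 0.75
```

Per-class figures matter here because class imbalance (e.g., few diseased hearts) can make the overall number misleading.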
Article
In this analytical cross‐sectional method comparison study, we evaluated brain MR images in 30 dogs and cats with and without using a DICOM‐based deep‐learning (DL) denoising algorithm developed specifically for veterinary patients. Quantitative comparison was performed by measuring signal‐to‐noise (SNR) and contrast‐to‐noise ratios (CNR) on the same T2‐weighted (T2W), T2‐FLAIR, and Gradient Echo (GRE) MR brain images in each patient (native images and after denoising) in identical regions of interest. Qualitative comparisons were then conducted: three experienced veterinary radiologists independently evaluated each patient's T2W, T2‐FLAIR, and GRE image series. Native and denoised images were evaluated separately, with observers blinded to the type of images they were assessing. For each image type (native and denoised) and pulse sequence type image, they assigned a subjective grade of coarseness, contrast, and overall quality. For all image series tested (T2W, T2‐FLAIR, and GRE), the SNRs of cortical gray matter, subcortical white matter, deep gray matter, and internal capsule were statistically significantly higher on images treated with DL denoising algorithm than native images. Similarly, for all image series types tested, the CNRs between cortical gray and white matter and between deep gray matter and internal capsule were significantly higher on DL algorithm‐treated images than native images. The qualitative analysis confirmed these results, with generally better coarseness, contrast, and overall quality scores for the images treated with the DL denoising algorithm. In this study, this DICOM‐based DL denoising algorithm reduced noise in 1.5T MRI canine and feline brain images, and radiologists’ perceived image quality improved.
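The SNR and CNR comparisons described above can be computed directly from region-of-interest statistics. A minimal sketch under the common definitions SNR = mean(ROI)/SD(noise) and CNR = |mean(A) − mean(B)|/SD(noise); the study's exact ROI protocol is not reproduced here, and the synthetic data are purely illustrative:

```python
import numpy as np

def snr(roi: np.ndarray, noise: np.ndarray) -> float:
    """Signal-to-noise ratio: mean tissue signal over noise standard deviation."""
    return float(roi.mean() / noise.std())

def cnr(roi_a: np.ndarray, roi_b: np.ndarray, noise: np.ndarray) -> float:
    """Contrast-to-noise ratio between two tissue ROIs."""
    return float(abs(roi_a.mean() - roi_b.mean()) / noise.std())

# Synthetic ROIs: gray matter vs. white matter plus a background-noise patch
rng = np.random.default_rng(42)
gray = rng.normal(100, 5, size=(20, 20))
white = rng.normal(140, 5, size=(20, 20))
background = rng.normal(0, 4, size=(20, 20))
print(snr(gray, background), cnr(gray, white, background))
```

Under these definitions, a denoising algorithm that halves the background standard deviation doubles both metrics, which is the effect the quantitative comparison above is measuring.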
Article
Full-text available
Objective To capture veterinary professionals’ perspectives and applications of AI in veterinary care. This study assesses the perceived benefits, challenges, and potential areas where AI could enhance veterinary medicine and practice workflows. Methods An online survey was distributed to members of the American Animal Hospital Association and Digitail's network of veterinary professionals. The questionnaire included 18 close-ended and 7 open-ended questions exploring awareness, perceptions, usage, expectations, and concerns about AI in veterinary medicine. The survey was open from December 19, 2023, through January 8, 2024. Results The survey gathered 3,968 responses from professionals in various veterinary roles. Most respondents were veterinarians and veterinary technicians, with an average age of 35. Conclusions Respondents demonstrated varying familiarity with AI, with an overall positive outlook toward its adoption in veterinary medicine. Those who actively use AI tools in their professional tasks reported higher levels of optimism about its integration. Key concerns included the reliability and accuracy of AI in diagnosis and treatment. The top benefits identified by respondents included improving efficiencies, streamlining administrative tasks, and potential contributions to revenue growth, employee satisfaction, and client retention. Clinical Relevance The findings underscore the influence of practical exposure and experience with AI tools on attitudes toward AI adoption. The positive correlation suggests that familiarity with AI technologies fosters trust and confidence, consequently driving greater acceptance and adoption within the veterinary community.
Article
Full-text available
The growing interest in efficient and accurate diagnosis of sheep diseases has prompted the development of an expert system to help veterinarians and farmers make informed decisions. This paper presents the design and implementation of a tailored expert system for knowledge-based diagnosis and treatment of sheep diseases. The system uses a rule-based approach, encoding expert knowledge as logical predicates and inference rules, to diagnose common diseases affecting sheep and recommend appropriate treatments. The system's knowledge base is built from domain information such as signs, symptoms, and environmental factors. The system aims to support proactive health management in sheep farming, reduce the need for immediate veterinary intervention, and improve disease-diagnosis efficiency by simulating expert decision-making processes. The implementation in Prolog demonstrates the feasibility and flexibility of the framework in real-world situations. In user evaluations, the system achieved a diagnostic accuracy rate of 89%, farmer feedback indicated an 87% success rate in treatment recommendations, and its output matched human experts in 91% of cases when identifying common sheep diseases. These advancements have the potential to transform the management of sheep health, thereby enhancing welfare and global agricultural productivity. Future systems may incorporate machine learning models to complement rule-based techniques and increase diagnostic precision and flexibility. By examining historical data and identifying trends, such systems could forecast disease epidemics or offer more accurate diagnoses. Mobile platforms and IoT technologies could make real-time livestock health monitoring possible, relieving farmers of manually entering data. Wearable technology could monitor physiological parameters, and environmental sensors could offer additional information about weather, temperature, and sanitation.
Future systems may also combine sensor-based analysis and image recognition to reduce reliance on human observation in identifying visible disease signs, such as lesions or aberrant behavior.
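The rule-based inference described above maps observed signs to candidate diagnoses. A minimal Python sketch of the idea; the rules, sign names, and scoring scheme are invented for illustration and are not taken from the system's Prolog knowledge base:

```python
# Hypothetical rule base: disease -> set of signs required by its rule
RULES = {
    "foot rot": {"lameness", "hoof odor"},
    "bloat": {"distended abdomen", "discomfort"},
    "pneumonia": {"cough", "fever", "nasal discharge"},
}

def diagnose(observed: set[str]) -> list[tuple[str, float]]:
    """Rank diseases by the fraction of their required signs observed."""
    matches = []
    for disease, signs in RULES.items():
        score = len(signs & observed) / len(signs)
        if score > 0:
            matches.append((disease, score))
    # Best-supported diagnosis first
    return sorted(matches, key=lambda m: m[1], reverse=True)

print(diagnose({"cough", "fever", "lameness"}))
```

A Prolog implementation expresses the same rules declaratively as predicates and lets the inference engine do the matching; the partial-match scoring here is one simple way to mimic that ranking imperatively.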
Article
Full-text available
The analysis of veterinary radiographic imaging data is an essential step in the diagnosis of many thoracic lesions. Given the limited time that physicians can devote to a single patient, it would be valuable to implement an automated system to help clinicians make faster but still accurate diagnoses. Currently, most of such systems are based on supervised deep learning approaches. However, the problem with these solutions is that they need a large database of labeled data. Access to such data is often limited, as it requires a great investment of both time and money. Therefore, in this work we present a solution that allows higher classification scores to be obtained using knowledge transfer from inter-species and inter-pathology self-supervised learning methods. Before training the network for classification, pretraining of the model was performed using self-supervised learning approaches on publicly available unlabeled radiographic data of human and dog images, which allowed substantially increasing the number of images for this phase. The self-supervised learning approaches included the Beta Variational Autoencoder, the Soft-Introspective Variational Autoencoder, and a Simple Framework for Contrastive Learning of Visual Representations. After the initial pretraining, fine-tuning was performed for the collected veterinary dataset using 20% of the available data. Next, a latent space exploration was performed for each model after which the encoding part of the model was fine-tuned again, this time in a supervised manner for classification. Simple Framework for Contrastive Learning of Visual Representations proved to be the most beneficial pretraining method. Therefore, it was for this method that experiments with various fine-tuning methods were carried out. We achieved a mean ROC AUC score of 0.77 and 0.66, respectively, for the laterolateral and dorsoventral projection datasets. 
The results show significant improvement compared to using the model without any pretraining approach.
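The ROC AUC scores reported above can be computed without tracing the full ROC curve, via the Mann–Whitney rank formulation: the AUC equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. A minimal NumPy sketch:

```python
import numpy as np

def roc_auc(y_true, scores) -> float:
    """AUC via the Mann-Whitney U statistic: probability that a random
    positive is scored higher than a random negative (ties count half)."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[y_true == 1], scores[y_true == 0]
    # Pairwise comparison; fine for small arrays, O(n_pos * n_neg)
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return float((greater + 0.5 * ties) / (len(pos) * len(neg)))

print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

A score of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is why the 0.77 and 0.66 figures above indicate moderate discriminative ability.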
Article
Full-text available
The aim of this study was to develop and test an artificial intelligence (AI)-based algorithm for detecting common technical errors in canine thoracic radiography. The algorithm was trained using a database of thoracic radiographs from three veterinary clinics in Italy, which were evaluated for image quality by three experienced veterinary diagnostic imagers. The algorithm was designed to classify the images as correct or as having one or more of the following errors: rotation, underexposure, overexposure, incorrect limb positioning, incorrect neck positioning, blurriness, cut-off, or the presence of foreign objects or medical devices. The algorithm was able to correctly identify errors in thoracic radiographs with an overall accuracy of 81.5% in latero-lateral and 75.7% in sagittal images. The most accurately identified errors were limb mispositioning and underexposure, both in latero-lateral and sagittal images. The accuracy of the developed model in the classification of technically correct radiographs was fair in latero-lateral and good in sagittal images. The authors conclude that their AI-based algorithm is a promising tool for improving the accuracy of radiographic interpretation by identifying technical errors in canine thoracic radiographs.
Article
Full-text available
An algorithm based on artificial intelligence (AI) was developed and tested to classify different stages of myxomatous mitral valve disease (MMVD) from canine thoracic radiographs. The radiographs were selected from the medical databases of two different institutions, considering dogs over 6 years of age that had undergone chest X-ray and echocardiographic examination. Only radiographs clearly showing the cardiac silhouette were considered. The convolutional neural network (CNN) was trained on both the right and left lateral and/or ventro-dorsal or dorso-ventral views. Each dog was classified according to the American College of Veterinary Internal Medicine (ACVIM) guidelines as stage B1, B2 or C + D. ResNet18 CNN was used as a classification network, and the results were evaluated using confusion matrices, receiver operating characteristic curves, and t-SNE and UMAP projections. The area under the curve (AUC) showed good heart-CNN performance in determining the MMVD stage from the lateral views with an AUC of 0.87, 0.77, and 0.88 for stages B1, B2, and C + D, respectively. The high accuracy of the algorithm in predicting the MMVD stage suggests that it could stand as a useful support tool in the interpretation of canine thoracic radiographs.
Article
Full-text available
Nephrolithiasis is one of the most common urinary disorders in dogs. Although a majority of kidney calculi are non-obstructive and likely to be asymptomatic, they can lead to parenchymal loss and obstruction as they progress. Thus, early diagnosis of kidney calculi is important for patient monitoring and a better prognosis. However, detecting kidney calculi and monitoring changes in calculi size on computed tomography (CT) images is time-consuming for clinicians. This study, the first of its kind, aims to develop a deep learning model for automatic kidney calculi detection using pre-contrast CT images of dogs. A total of 34,655 transverse image slices obtained from 76 dogs with kidney calculi were used to develop the deep learning model. Because of the differences in kidney location and calculi sizes in dogs compared to humans, several processing methods were used. The first stage of the models, based on the Attention U-Net (AttUNet), was designed to detect the kidney for the coarse feature map. Five different models (AttUNet, UTNet, TransUNet, SwinUNet, and RBCANet) were used in the second stage to detect the calculi in the kidneys, and the performance of the models was evaluated. Compared with a previously developed model, all the models developed in this study yielded better Dice similarity coefficients (DSCs) for the automatic segmentation of the kidney. For detecting kidney calculi, RBCANet and SwinUNet yielded the best DSC, 0.74. In conclusion, the deep learning model developed in this study can be useful for the automated detection of kidney calculi.
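The Dice similarity coefficient used to evaluate the segmentations above compares a predicted binary mask against a manual reference mask. A minimal sketch; the toy masks are illustrative, not from the study's data:

```python
import numpy as np

def dice(pred: np.ndarray, truth: np.ndarray) -> float:
    """Dice similarity coefficient between two binary masks:
    2|A intersect B| / (|A| + |B|)."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return float(2.0 * (pred & truth).sum() / denom)

# Two partially overlapping square "calculi" on a small slice
a = np.zeros((10, 10), dtype=bool); a[2:6, 2:6] = True  # 16 px
b = np.zeros((10, 10), dtype=bool); b[4:8, 4:8] = True  # 16 px
print(dice(a, b))  # 2*4 / 32 = 0.25
```

A DSC of 1.0 means perfect overlap and 0.0 means none; for small structures like calculi, scores around 0.74 as reported above are considered reasonable because a few misplaced voxels shift the metric substantially.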
Article
Full-text available
A large-scale postmortem auditing of antemortem imaging diagnoses has yet to be accomplished in veterinary medicine. For this retrospective, observational, single-center, diagnostic accuracy study, necropsy reports for patients of The Schwarzman Animal Medical Center were collected over a 1-year period. Each necropsy diagnosis was determined to be either correctly diagnosed or discrepant with its corresponding antemortem diagnostic imaging, and discrepancies were categorized. The radiologic error rate was calculated to include only clinically significant missed diagnoses (lesion was not reported but was retrospectively visible on the image) and misinterpretations (lesion was noted but was incorrectly diagnosed). Nonerror discrepancies, such as temporal indeterminacy, microscopic limitations, sensitivity limitations, and study-type limitations, were not included in the error rate. A total of 1099 necropsy diagnoses had corresponding antemortem imaging; 440 diagnoses were classified as major diagnoses, of which 176 were discrepant, for a major discrepancy rate of 40%, similar to reports in people. Seventeen major discrepancies were diagnoses that were missed or misinterpreted by the radiologist, for a calculated radiologic error rate of 4.6%, comparable with error rates of 3%-5% reported in people. From 2020 to 2021, nearly half of all clinically significant abnormalities noted at necropsy went undetected by antemortem imaging, though most discrepancies were attributable to factors other than radiologic error. Identifying common patterns of misdiagnosis and discrepancy will help radiologists refine their analysis of imaging studies to potentially reduce interpretive error.
Article
Full-text available
This paper provides the first comprehensive analysis of ethical issues raised by artificial intelligence (AI) in veterinary medicine for companion animals. Veterinary medicine is a socially valued service, which, like human medicine, will likely be significantly affected by AI. Veterinary AI raises some unique ethical issues because of the nature of the client–patient–practitioner relationship, society’s relatively minimal valuation and protection of nonhuman animals and differences in opinion about responsibilities to animal patients and human clients. The paper examines how these distinctive features influence the ethics of AI systems that might benefit clients, veterinarians and animal patients—but also harm them. It offers practical ethical guidance that should interest ethicists, veterinarians, clinic owners, veterinary bodies and regulators, clients, technology developers and AI researchers.
Article
Full-text available
Background Radiotherapy (RT) is increasingly being used on dogs with spontaneous head and neck cancer (HNC), which account for a large percentage of veterinary patients treated with RT. Accurate definition of the gross tumor volume (GTV) is a vital part of RT planning, ensuring adequate dose coverage of the tumor while limiting the radiation dose to surrounding tissues. Currently the GTV is contoured manually in medical images, which is a time-consuming and challenging task. Purpose The purpose of this study was to evaluate the applicability of deep learning-based automatic segmentation of the GTV in canine patients with HNC. Materials and methods Contrast-enhanced computed tomography (CT) images and corresponding manual GTV contours of 36 canine HNC patients and 197 human HNC patients were included. A 3D U-Net convolutional neural network (CNN) was trained to automatically segment the GTV in canine patients using two main approaches: (i) training models from scratch based solely on canine CT images, and (ii) using cross-species transfer learning where models were pretrained on CT images of human patients and then fine-tuned on CT images of canine patients. For the canine patients, automatic segmentations were assessed using the Dice similarity coefficient (Dice), the positive predictive value, the true positive rate, and surface distance metrics, calculated from a four-fold cross-validation strategy where each fold was used as a validation set and test set once in independent model runs. Results CNN models trained from scratch on canine data or by using transfer learning obtained mean test set Dice scores of 0.55 and 0.52, respectively, indicating acceptable auto-segmentations, similar to the mean Dice performances reported for CT-based automatic segmentation in human HNC studies. Automatic segmentation of nasal cavity tumors appeared particularly promising, resulting in mean test set Dice scores of 0.69 for both approaches. 
Conclusion: Deep learning-based automatic segmentation of the GTV using CNN models based on canine data only, or using a cross-species transfer learning approach, shows promise for future application in RT of canine HNC patients.
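The Dice similarity coefficient used above to score auto-segmentations can be computed directly from two binary masks. A minimal sketch (the function name and the flat-list mask representation are illustrative, not taken from the paper):

```python
def dice_coefficient(pred, truth):
    """Dice similarity coefficient between two binary masks.

    Masks are flat sequences of 0/1 voxel labels; Dice is
    2|A intersect B| / (|A| + |B|), from 0 (no overlap) to 1 (identical).
    """
    intersection = sum(p * t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    if total == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * intersection / total
```

On this scale, a mean test-set Dice of 0.55 indicates that predicted and manual GTV contours overlap roughly as well as reported for CT-based auto-segmentation in human HNC studies.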
Article
Full-text available
Kidney volume is associated with renal function and the severity of renal diseases, so accurate assessment of the kidney is important. Although the voxel count method is reported to be more accurate than several other methods, its laborious and time-consuming process is its main limitation. Given the need for a technology that is fast yet as accurate as manual voxel counting, the aim of this study was to develop the first deep learning model for automatic kidney detection and volume estimation from computed tomography (CT) images of dogs. A total of 182,974 image slices from 386 CT scans of 211 dogs were used to develop this deep learning model. Owing to the greater variance in kidney size and location in dogs compared with humans, several processing methods were applied together with an architecture based on UNEt TRansformers (UNETR), which has shown promising results for various medical image segmentation tasks. A combined loss function and data augmentation were applied to raise the performance of the model. The Dice similarity coefficient (DSC), which quantifies the similarity between the manual segmentation and the deep learning model's automated segmentation, was 0.915 ± 0.054 (mean ± SD) with post-processing. Agreement between the kidney volume estimated by the manual voxel count method and by the deep learning model was r = 0.960 (p < 0.001), with Lin's concordance correlation coefficient (CCC) of 0.95 and an intraclass correlation coefficient (ICC) of 0.975. Kidney volume was positively correlated with body weight (BW) and not significantly correlated with body condition score (BCS), age, or sex. The regressions relating BW, BCS, and kidney volume were: kidney volume = 3.701 × BW + 11.962 (R² = 0.74, p < 0.001) and kidney volume = 19.823 × BW/BCS index + 10.705 (R² = 0.72, p < 0.001). The deep learning model developed in this study is useful for the automatic estimation of kidney volume.
Furthermore, the reference range established in this study for CT-based normal kidney volume, which accounts for BW and BCS, can be helpful in the assessment of kidneys in dogs.
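The reported regressions can be applied directly. A sketch using the coefficients quoted in the abstract (the function names are illustrative, and reading "BW/BCS index" as body weight divided by body condition score is an assumption on my part):

```python
def kidney_volume_from_bw(bw_kg):
    """Estimated kidney volume (cm^3) from body weight (kg):
    volume = 3.701 * BW + 11.962 (R^2 = 0.74, per the study)."""
    return 3.701 * bw_kg + 11.962


def kidney_volume_from_bw_bcs(bw_kg, bcs):
    """Estimated kidney volume (cm^3) from the BW/BCS index:
    volume = 19.823 * (BW / BCS) + 10.705 (R^2 = 0.72).
    Interpreting the index as BW divided by BCS is an assumption."""
    return 19.823 * (bw_kg / bcs) + 10.705
```

For example, a 10 kg dog would have an expected kidney volume of about 49 cm³ under the first regression.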
Article
Full-text available
Since most degenerative canine heart diseases are accompanied by cardiomegaly, early detection of cardiac enlargement is a top healthcare priority for dogs. In this study, we developed a new deep learning-based radiographic index quantifying canine heart size using retrospective data. The proposed "adjusted heart volume index" (aHVI) was calculated as the total area of the heart multiplied by the heart's height and divided by the length of the fourth thoracic vertebral body (T4) on plain lateral X-rays. The algorithm consists of segmentation and measurement stages. For semantic segmentation, we used radiographic images of 1,000 dogs taken between Jan 2018 and Aug 2020 at Seoul National University Veterinary Medicine Teaching Hospital. Tversky loss functions with multiple hyperparameters were used to capture the size-imbalanced regions of the heart and T4. The aHVI outperformed the current clinical standard in predicting cardiac enlargement, a common but often fatal health condition in small, older dogs.
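Taken literally, the aHVI described above reduces to a one-line calculation. A sketch that applies the abstract's formula exactly as written (the abstract does not state units or any further normalisation, so those choices here are assumptions):

```python
def adjusted_heart_volume_index(heart_area, heart_height, t4_length):
    """aHVI as described in the abstract: total heart area on the
    lateral view, multiplied by the heart's height, divided by the
    T4 vertebral body length. All inputs are assumed to share one
    unit of length (e.g. pixels or cm); the paper may normalise
    differently."""
    return heart_area * heart_height / t4_length
```

In the study the three measurements are produced automatically by the segmentation stage rather than entered by hand.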
Article
Full-text available
While still in its infancy, the application of deep convolutional neural networks in veterinary diagnostic imaging is a rapidly growing field. Convolutional neural networks are the preferred deep learning architecture here, as they provide the structure best suited to the analysis of medical images. This retrospective exploratory study tested the applicability of such networks for delineating certain organs with respect to their surrounding tissues. More precisely, a deep convolutional neural network was trained to segment medial retropharyngeal lymph nodes in a study dataset consisting of CT scans of canine heads. With a limited dataset of 40 patients, the network, in conjunction with image augmentation techniques, achieved an intersection-over-union of overall fair performance (median 39%; 25th percentile at 22%, 75th percentile at 51%). The results indicate that these architectures can indeed be trained to segment anatomic structures in anatomically complicated regions with breed-related variation, such as the head, possibly even using small training sets. As these conditions are quite common in veterinary medical imaging, all routines were published as an open-source Python package in the hope of simplifying future research projects in the community.
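The intersection-over-union quoted above is a standard overlap score, closely related to Dice. A minimal sketch on flat binary masks (the representation is illustrative):

```python
def intersection_over_union(pred, truth):
    """IoU (Jaccard index) between two binary masks given as
    flat sequences of 0/1 labels."""
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    union = sum(1 for p, t in zip(pred, truth) if p or t)
    if union == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return intersection / union
```

Since Dice = 2·IoU / (1 + IoU), the study's median IoU of 39% corresponds to a Dice of roughly 0.56, which helps when comparing against papers that report Dice instead.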
Article
Full-text available
Artificial intelligence (AI) is being applied in medicine to improve healthcare and advance health equity. The application of AI-based technologies in radiology is expected to improve diagnostic performance by increasing accuracy and simplifying personalized decision-making. While this technology has the potential to improve health services, many ethical and societal implications need to be carefully considered to avoid harmful consequences for individuals and groups, especially for the most vulnerable populations. Therefore, several questions are raised, including (1) what types of ethical issues are raised by the use of AI in medicine and biomedical research, and (2) how are these issues being tackled in radiology, especially in the case of breast cancer? To answer these questions, a systematic review of the academic literature was conducted. Searches were performed in five electronic databases to identify peer-reviewed articles published since 2017 on the topic of the ethics of AI in radiology. The review results show that the discourse has mainly addressed expectations and challenges associated with medical AI, and in particular bias and black box issues, and that various guiding principles have been suggested to ensure ethical AI. We found that several ethical and societal implications of AI use remain underexplored, and more attention needs to be paid to addressing potential discriminatory effects and injustices. We conclude with a critical reflection on these issues and the identified gaps in the discourse from a philosophical and STS perspective, underlining the need to integrate a social science perspective in AI developments in radiology in the future.
Article
Full-text available
The thoracic radiograph (TR) is a complementary exam widely used in small animal medicine that requires careful analysis to take full advantage of radiographic pulmonary patterns (RPPs). Although promising advances have been made in deep learning for veterinary imaging, the development of a convolutional neural network (CNN) to detect RPPs specifically from feline TR images has not been investigated. Here, a CNN based on ResNet50V2 and pre-trained on ImageNet is first fine-tuned on human chest X-rays and then fine-tuned again on 500 annotated TR images from the veterinary campus of VetAgro Sup (Lyon, France). The impact on the CNN's performance of manually segmenting the TR's intrathoracic area and of a contrast-enhancement method was compared. To improve classification performance, 200 networks were trained on random shuffles of the training and validation sets. A voting approach over these 200 networks trained on segmented TR images produced the best classification performance, achieving a mean accuracy, F1-score, specificity, positive predictive value, and sensitivity of 82%, 85%, 75%, 81%, and 88%, respectively, on the test set. Finally, the classification schemes were discussed in the light of an ensemble method of class activation maps, which confirmed that the proposed approach is helpful for veterinarians.
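The voting approach over the 200 trained networks can be sketched as a simple majority vote on per-network binary predictions (the tie-breaking rule below is my assumption; the paper may instead average probabilities):

```python
def majority_vote(predictions):
    """Binary ensemble decision over per-network predictions (0/1):
    positive if at least half the networks vote positive.
    Resolving exact ties toward positive is an assumption here."""
    positives = sum(1 for p in predictions if p)
    return 1 if 2 * positives >= len(predictions) else 0
```

Voting over many networks trained on different data shuffles reduces the variance of any single model's decision, which is why the ensemble outperformed individual networks.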
Article
Full-text available
The use of artificial intelligence (AI) algorithms in diagnostic radiology is a developing area in veterinary medicine and may provide substantial benefit in many clinical settings. These range from timely image interpretation in the emergency setting when no boarded radiologist is available to allowing boarded radiologists to focus on more challenging cases that require complex medical decision making. Testing the performance of artificial intelligence (AI) software in veterinary medicine is at its early stages, and only a scant number of reports of validation of AI software have been published. The purpose of this study was to investigate the performance of an AI algorithm (Vetology AI®) in the detection of pleural effusion in thoracic radiographs of dogs. In this retrospective, diagnostic case–controlled study, 62 canine patients were recruited. A control group of 21 dogs with normal thoracic radiographs and a sample group of 41 dogs with confirmed pleural effusion were selected from the electronic medical records at the Cummings School of Veterinary Medicine. The images were cropped to include only the area of interest (i.e., thorax). The software then classified images into those with pleural effusion and those without. The AI algorithm was able to determine the presence of pleural effusion with 88.7% accuracy (P < 0.05). The sensitivity and specificity were 90.2% and 81.8%, respectively (positive predictive value, 92.5%; negative predictive value, 81.8%). The application of this technology in the diagnostic interpretation of thoracic radiographs in veterinary medicine appears to be of value and warrants further investigation and testing.
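The accuracy, sensitivity, specificity, and predictive values quoted above all derive from the same 2x2 confusion matrix. A minimal sketch (function and key names are illustrative):

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard screening metrics from confusion-matrix counts:
    true/false positives (tp, fp) and true/false negatives (tn, fn)."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }
```

Note that PPV and NPV, unlike sensitivity and specificity, depend on how many positive and negative cases are in the sample, so they shift with disease prevalence.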
Article
Full-text available
Veterinary medicine is a broad and growing discipline that includes topics such as companion animal health, population medicine and zoonotic diseases, and agriculture. In this article, we provide insight on how artificial intelligence works and how it is currently applied in veterinary medicine. We also discuss its potential in veterinary medicine. Given the rapid pace of research and commercial product developments in this area, the next several years will pose challenges to understanding, interpreting, and adopting this powerful and evolving technology. Artificial intelligence has the potential to enable veterinarians to perform tasks more efficiently while providing new insights for the management and treatment of disorders. It is our hope that this will translate to better quality of life for animals and those who care for them.
Article
Full-text available
Background Previous studies evaluating the accuracy of computed tomography (CT) in detecting caudal vena cava (CVC) invasion by adrenal tumors (AT) used a binary system and did not evaluate for other vessels. Objective Test a 7‐point scale CT grading system for accuracy in predicting vascular invasion and for repeatability among radiologists. Build a decision tree based on CT criteria to predict tumor type. Methods Retrospective observational cross‐sectional case study. Abdominal CT studies were analyzed by 3 radiologists using a 7‐point CT grading scale for vascular invasion and by 1 radiologist for CT features of AT. Animals Dogs with AT that underwent adrenalectomy and had pre‐ and postcontrast CT. Results Ninety‐one dogs; 45 adrenocortical carcinomas (50%), 36 pheochromocytomas (40%), 9 adrenocortical adenomas (10%) and 1 unknown tumor. Carcinoma and pheochromocytoma differed in pre‐ and postcontrast attenuation, contralateral adrenal size, tumor thrombus short‐ and long‐axis, and tumor and thrombus mineralization. A decision tree was built based on these differences. Adenoma and malignant tumors differed in contour irregularity. Probability of vascular invasion was dependent on CT grading scale, and a large equivocal zone existed between 3 and 6 scores, lowering CT accuracy to detect vascular invasion. Radiologists' agreement for detecting abnormalities (evaluated by chance‐corrected weighted kappa statistics) was excellent for CVC and good to moderate for other vessels. The quality of postcontrast CT study had a negative impact on radiologists' performance and agreement. Conclusions and Clinical Importance Features of CT may help radiologists predict AT type and provide probabilistic information on vascular invasion.
Article
Full-text available
Heart disease is a leading cause of death among cats and dogs. Vertebral heart scale (VHS) is one tool to quantify radiographic cardiac enlargement and to predict the occurrence of congestive heart failure. The aim of this study was to evaluate the performance of artificial intelligence (AI) performing VHS measurements when compared with two board-certified specialists. Ground truth consisted of the average of constituent VHS measurements performed by board-certified specialists. Thirty canine and 30 feline thoracic lateral radiographs were evaluated by each operator, using two different methods for determination of the cardiac short axis on dogs' radiographs: the original approach published by Buchanan and the modified approach proposed by the EPIC trial authors, and only Buchanan's method for cats' radiographs. Overall, the VHS calculated by the AI, radiologist, and cardiologist had a high degree of agreement in both canine and feline patients (intraclass correlation coefficient (ICC) = 0.998). In canine patients, when comparing methods used to calculate VHS by specialists, there was also a high degree of agreement (ICC = 0.999). When evaluating specifically the results of the AI VHS vs. the two specialists' readings, the agreement was excellent for both canine (ICC = 0.998) and feline radiographs (ICC = 0.998). Performance of AI trained to locate VHS reference points agreed with manual calculation by specialists in both cats and dogs. Such a computer-aided technique might be an important asset for veterinarians in general practice to limit interobserver variability and obtain more comparable VHS reading over time.
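Buchanan's VHS expresses the cardiac long and short axes in vertebral-body units counted caudally from T4. If a mean vertebral-body length is used in place of vertebra counting, the calculation reduces to the sketch below; that simplification is mine, not the study's:

```python
def vertebral_heart_scale(long_axis, short_axis, mean_vertebra_length):
    """Approximate VHS: each cardiac axis converted to vertebral units
    using a mean vertebral-body length measured caudally from T4,
    then summed. Clinically, each axis is transposed onto the spine
    and vertebrae are counted to the nearest 0.1; dividing by a mean
    length is a simplification of that procedure."""
    return (long_axis + short_axis) / mean_vertebra_length
```

The AI in the study locates the VHS reference points itself, so its role is to automate exactly these axis and vertebral measurements before the arithmetic above.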
Article
Full-text available
Deep Learning based Convolutional Neural Networks (CNNs) are the state-of-the-art machine learning technique with medical image data. They have the ability to process large amounts of data and learn image features directly from the raw data. Based on their training, these networks are ultimately able to classify unknown data and make predictions. Magnetic resonance imaging (MRI) is the imaging modality of choice for many spinal cord disorders. Proper interpretation requires time and expertise from radiologists, so there is great interest in using artificial intelligence to more quickly interpret and diagnose medical imaging data. In this study, a CNN was trained and tested using thoracolumbar MR images from 500 dogs. T1- and T2-weighted MR images in sagittal and transverse planes were used. The network was trained with unremarkable images as well as with images showing the following spinal cord pathologies: intervertebral disc extrusion (IVDE), intervertebral disc protrusion (IVDP), fibrocartilaginous embolism (FCE)/acute non-compressive nucleus pulposus extrusion (ANNPE), syringomyelia and neoplasia. 2,693 MR images from 375 dogs were used for network training. The network was tested using 7,695 MR images from 125 dogs. The network performed best in detecting IVDPs on sagittal T1-weighted images, with a sensitivity of 100% and specificity of 95.1%. The network also performed very well in detecting IVDEs, especially on sagittal T2-weighted images, with a sensitivity of 90.8% and specificity of 98.98%. The network detected FCEs and ANNPEs with a sensitivity of 62.22% and a specificity of 97.90% on sagittal T2-weighted images and with a sensitivity of 91% and a specificity of 90% on transverse T2-weighted images. In detecting neoplasms and syringomyelia, the CNN did not perform well because of insufficient training data or because the network had problems differentiating different hyperintensities on T2-weighted images and thus made incorrect predictions. 
This study has shown that it is possible to train a CNN to recognize and differentiate various spinal cord pathologies on canine MR images. CNNs therefore have great potential to act as a "second eye" for imagers in the future, providing a faster focus on the altered image area and thus improving workflow in radiology.
Article
Full-text available
An artificial intelligence (AI)-based computer-aided detection (CAD) algorithm to detect some of the most common radiographic findings in the feline thorax was developed and tested. The database used for training comprised radiographs acquired at two different institutions; only correctly exposed and positioned radiographs were included. The presence of several radiographic findings was recorded. Consequently, the radiographic findings included for training were: no findings, bronchial pattern, pleural effusion, mass, alveolar pattern, pneumothorax, and cardiomegaly. Multi-label convolutional neural networks (CNNs) were used to develop the CAD algorithm, and the performance of two different CNN architectures, ResNet 50 and Inception V3, was compared. Both architectures had an area under the receiver operating characteristic curve (AUC) above 0.9 for alveolar pattern, bronchial pattern, and pleural effusion; an AUC above 0.8 for no findings and pneumothorax; and an AUC above 0.7 for cardiomegaly. The AUC for mass was low (above 0.5) for both architectures. No significant differences were evident in the diagnostic accuracy of the two architectures.
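The AUC values reported above have a direct probabilistic reading: the chance that a randomly chosen positive case receives a higher score than a randomly chosen negative one. A minimal sketch via the Mann-Whitney statistic (names are illustrative):

```python
def roc_auc(pos_scores, neg_scores):
    """AUC as the Mann-Whitney probability that a positive case
    outscores a negative one; tied scores count as half a win."""
    wins = 0.0
    for sp in pos_scores:
        for sn in neg_scores:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))
```

By this reading, the "low (above 0.5)" AUC for mass means the CNNs ranked mass-positive radiographs only slightly better than chance.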
Article
Full-text available
Purpose: This study was conducted to develop a deep learning-based automatic segmentation (DLBAS) model of head and neck organs for radiotherapy (RT) in dogs, and to evaluate its feasibility for delineating RT plans. Materials and Methods: Fifteen organs at risk (OARs) in the head and neck of dogs were defined for segmentation. Post-contrast computed tomography (CT) was performed in 90 dogs. The training and validation sets comprised 80 CT data sets, including 20 test sets. The accuracy of the segmentation was assessed using both the Dice similarity coefficient (DSC) and the Hausdorff distance (HD), referencing the expert contours as the ground truth. An additional 10 clinical test sets from cancer patients with relatively large displacement or deformation of organs were selected for verification. To evaluate applicability in cancer patients and the impact of expert intervention, three methods were compared: HA, DLBAS, and the readjustment of the predictions obtained via DLBAS on the clinical test sets (HA_DLBAS). Results: The DLBAS model (in the 20 test sets) showed reliable DSC and HD values and a short contouring time of ~3 s. The average (mean ± standard deviation) DSC (0.83 ± 0.04) and HD (2.71 ± 1.01 mm) values were similar to those of previous human studies. The DLBAS was highly accurate for cases without large displacement of head and neck organs. However, on the 10 clinical test sets the DLBAS showed lower DSC (0.78 ± 0.11) and higher HD (4.30 ± 3.69 mm) values than on the test sets. Compared with both HA (DSC: 0.85 ± 0.06; HD: 2.74 ± 1.18 mm) and DLBAS, the HA_DLBAS method presented better metrics and smaller statistical deviations (DSC: 0.94 ± 0.03; HD: 2.30 ± 0.41 mm). In addition, the contouring time of HA_DLBAS (30 min) was less than that of HA (80 min). Conclusion: The HA_DLBAS method and the proposed DLBAS were highly consistent and robust in their performance.
Thus, DLBAS has great potential as a standalone or supportive tool for this key step in RT planning.
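The Hausdorff distance used alongside Dice above captures the worst-case contour disagreement rather than average overlap. A minimal sketch on 2D point sets (a real evaluation would run on 3D contour or surface voxels):

```python
import math


def hausdorff_distance(set_a, set_b):
    """Symmetric Hausdorff distance between two non-empty point sets:
    the largest distance from any point in one set to its nearest
    neighbour in the other set."""
    def directed(src, dst):
        return max(min(math.dist(p, q) for q in dst) for p in src)
    return max(directed(set_a, set_b), directed(set_b, set_a))
```

Because HD is a maximum, a single stray contour point can dominate it, which is why it complements the overlap-averaged DSC in segmentation studies like this one.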
Article
Full-text available
Veterinarians use X-rays in almost all examinations of clinical fractures to determine the appropriate treatment. Before treatment, vets need to know the date of the injury, the type of broken bone, and the age of the dog, since the dog's maturity and the timing of the fracture affect the approach to the fracture site, the surgical procedure, and the materials needed. This comprehensive study has three main goals: determining the maturity of the dogs (Task 1), dating fractures (Task 2), and detecting fractures of the long bones in dogs (Task 3). The most popular deep neural networks are used: AlexNet, ResNet-50, and GoogLeNet. One of the most popular machine learning algorithms, the support vector machine (SVM), is used for comparison. The performance of all sub-studies is evaluated using accuracy and F1 score. Each task was best addressed by a different network architecture: ResNet-50, AlexNet, and GoogLeNet were the most successful algorithms for the three tasks, with F1 scores of 0.75, 0.80, and 0.88, respectively. With data augmentation to make the models more robust, the F1 scores of the three tasks were 0.80, 0.81, and 0.89 using ResNet-50, the most successful model. This preliminary work can be developed into support tools for practicing veterinarians that will make a difference in the treatment of dogs with fractured bones. Considering the lack of work in this interdisciplinary field, this paper may lead to future studies.
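The F1 scores reported above are the harmonic mean of precision and recall. A minimal sketch from confusion-matrix counts (the function name is illustrative):

```python
def f1_score(tp, fp, fn):
    """F1 score: harmonic mean of precision (tp / (tp + fp))
    and recall (tp / (tp + fn)). True negatives do not enter,
    which is why F1 is often preferred on imbalanced tasks."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Reporting F1 alongside accuracy, as this study does, guards against accuracy looking deceptively high when one class dominates the data.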
Article
Full-text available
The aim of this study was to describe the computed tomographic (CT) features of focal liver lesions (FLLs) in dogs that could enable prediction of lesion histotype. Dogs diagnosed with FLLs through both CT and cytopathology and/or histopathology were retrospectively collected. Ten qualitative and six quantitative CT features were described for each case, and a machine learning-based decision tree was developed to predict the lesion histotype. Four categories of FLLs (hepatocellular carcinoma [HCC, n = 13], nodular hyperplasia [NH, n = 19], other benign lesions [OBL, n = 18], and other malignant lesions [OML, n = 19]) were evaluated in 69 dogs. Five of the observed qualitative CT features proved statistically significant in distinguishing the four categories: surface, appearance, lymph-node appearance, capsule formation, and homogeneity of contrast medium distribution. Three of the quantitative CT features differed significantly between the four categories: the Hounsfield units (HU) of the radiologically normal liver parenchyma on the pre-contrast scan, the maximum dimension, and the ellipsoid volume of the lesion. Using the machine learning-based decision tree, it was possible to correctly classify NHs, OBLs, HCCs, and OMLs with accuracies of 0.74, 0.88, 0.87, and 0.75, respectively. The developed decision tree could be an easy-to-use tool to predict the histotype of different FLLs in dogs, although cytology and histology remain necessary for a final diagnosis.
Article
Full-text available
The interpretation of thoracic radiographs is a challenging and error-prone task for veterinarians. Despite recent advancements in machine learning and computer vision, the development of computer-aided diagnostic systems for radiographs remains a challenging and unsolved problem, particularly in veterinary medicine. In this study, a novel method based on a multi-label deep convolutional neural network (CNN) was developed for the classification of thoracic radiographs in dogs. All thoracic radiographs of dogs performed between 2010 and 2020 in the institution were retrospectively collected. Radiographs were taken with two different acquisition systems and were divided into two data sets accordingly: one (Data Set 1) was used for training and testing, and the other (Data Set 2) was used to test the generalization ability of the CNNs. The radiographic findings used as non-mutually exclusive labels to train the CNNs were: unremarkable, cardiomegaly, alveolar pattern, bronchial pattern, interstitial pattern, mass, pleural effusion, pneumothorax, and megaesophagus. Two CNNs, based on the ResNet-50 and DenseNet-121 architectures respectively, were developed and tested. The ResNet-50-based CNN had an area under the receiver operating characteristic curve (AUC) above 0.8 for all included radiographic findings except the bronchial and interstitial patterns, on both Data Set 1 and Data Set 2. The DenseNet-121-based CNN had a lower overall performance. Statistically significant differences in generalization ability between the two CNNs were evident, with the ResNet-50-based CNN showing better performance for alveolar pattern, interstitial pattern, megaesophagus, and pneumothorax.
Article
Full-text available
Although deep learning has been explored extensively for computer-aided medical imaging diagnosis in human medicine, very little has been done in veterinary medicine. The goal of this retrospective pilot project was to apply deep learning to thoracic radiographs for detection of canine left atrial enlargement and compare the results with veterinary radiologists' interpretations. Seven hundred ninety-two right lateral radiographs from canine patients with thoracic radiographs and contemporaneous echocardiograms were used to train, validate, and test a convolutional neural network algorithm. The accuracy, sensitivity, and specificity for determination of left atrial enlargement were then compared with those of board-certified veterinary radiologists as recorded on radiology reports. The accuracy, sensitivity, and specificity were 82.71%, 68.42%, and 87.09%, respectively, using an accuracy-driven variant of the convolutional neural network algorithm, and 79.01%, 73.68%, and 80.64%, respectively, using a sensitivity-driven variant. By comparison, the accuracy, sensitivity, and specificity achieved by board-certified veterinary radiologists were 82.71%, 68.42%, and 87.09%, respectively. Although the overall accuracy of the accuracy-driven convolutional neural network algorithm and the veterinary radiologists was identical, concordance between the two approaches was 85.19%. This study documents proof of concept for the application of deep learning techniques to computer-aided diagnosis in veterinary medicine.
Article
Full-text available
The purpose of this study was to develop a computer-aided detection (CAD) device based on convolutional neural networks (CNNs) to detect cardiomegaly from plain radiographs in dogs. Right lateral chest radiographs (n = 1,465) were retrospectively selected from archives. The radiographs were classified as having a normal cardiac silhouette (No-vertebral heart scale [VHS]-Cardiomegaly) or an enlarged cardiac silhouette (VHS-Cardiomegaly) based on the breed-specific VHS. The database was divided into a training set (1,153 images) and a test set (315 images). The diagnostic accuracy of four different CNN models in the detection of cardiomegaly was calculated using the test set. All tested models had an area under the curve >0.9, demonstrating high diagnostic accuracy. There was a statistically significant difference between Model C and the remaining models (Model A vs. Model C, P = 0.0298; Model B vs. Model C, P = 0.003; Model C vs. Model D, P = 0.0018), but there were no significant differences between the other combinations of models (Model A vs. Model B, P = 0.395; Model A vs. Model D, P = 0.128; Model B vs. Model D, P = 0.373). Convolutional neural networks could therefore assist veterinarians in detecting cardiomegaly in dogs from plain radiographs.
Article
Full-text available
This conceptual paper addresses the issues of transparency as linked to artificial intelligence (AI) from socio-legal and computer scientific perspectives. Firstly, we discuss the conceptual distinction between transparency in AI and algorithmic transparency, and argue for the wider concept ‘in AI’, as a partly contested albeit useful notion in relation to transparency. Secondly, we show that transparency as a general concept is multifaceted, and of widespread theoretical use in multiple disciplines over time, particularly since the 1990s. Still, it has had a resurgence in contemporary notions of AI governance, such as in the multitude of recently published ethics guidelines on AI. Thirdly, we discuss and show the relevance of the fact that transparency expresses a conceptual metaphor of more general significance, linked to knowing, bringing positive connotations that may have normative effects to regulatory debates. Finally, we draw a possible categorisation of aspects related to transparency in AI, or what we interchangeably call AI transparency, and argue for the need of developing a multidisciplinary understanding, in order to contribute to the governance of AI as applied on markets and in society.
Article
Full-text available
This is a condensed summary of an international multisociety statement on ethics of artificial intelligence (AI) in radiology produced by the ACR, European Society of Radiology, RSNA, Society for Imaging Informatics in Medicine, European Society of Medical Imaging Informatics, Canadian Association of Radiologists, and American Association of Physicists in Medicine. AI has great potential to increase efficiency and accuracy throughout radiology, but it also carries inherent pitfalls and biases. Widespread use of AI-based intelligent and autonomous systems in radiology can increase the risk of systemic errors with high consequence and highlights complex ethical and societal issues. Currently, there is little experience using AI for patient care in diverse clinical settings. Extensive research is needed to understand how to best deploy AI in clinical practice. This statement highlights our consensus that ethical use of AI in radiology should promote well-being, minimize harm, and ensure that the benefits and harms are distributed among stakeholders in a just manner. We believe AI should respect human rights and freedoms, including dignity and privacy. It should be designed for maximum transparency and dependability. Ultimate responsibility and accountability for AI remains with its human designers and operators for the foreseeable future. The radiology community should start now to develop codes of ethics and practice for AI that promote any use that helps patients and the common good and should block use of radiology data and algorithms for financial gain without those two attributes. This article is a simultaneous joint publication in Radiology, Journal of the American College of Radiology, Canadian Association of Radiologists Journal, and Insights into Imaging. Published under a CC BY-NC-ND 4.0 license. Online supplemental material is available for this article.
Article
Full-text available
Background Chiari‐like malformation (CM) is a complex malformation of the skull and cranial cervical vertebrae that potentially results in pain and secondary syringomyelia (SM). Chiari‐like malformation‐associated pain (CM‐P) can be challenging to diagnose. We propose a machine learning approach to characterize morphological changes in dogs that may or may not be apparent to human observers. This data‐driven approach can remove potential bias (or blindness) that may be produced by a hypothesis‐driven expert observer approach. Hypothesis/Objectives To understand neuromorphological change and to identify image‐based biomarkers in dogs with CM‐P and symptomatic SM (SM‐S) using a novel machine learning approach, with the aim of increasing the understanding of these disorders. Animals Thirty‐two client‐owned Cavalier King Charles Spaniels (CKCSs; 11 controls, 10 CM‐P, 11 SM‐S). Methods Retrospective study using T2‐weighted midsagittal Digital Imaging and Communications in Medicine (DICOM) anonymized images, which then were mapped to images of an average clinically normal CKCS reference using Demons image registration. Key deformation features were automatically selected from the resulting deformation maps. A kernelized support vector machine was used for classifying characteristic localized changes in morphology. Results Candidate biomarkers were identified with receiver operating characteristic curves with area under the curve (AUC) of 0.78 (sensitivity 82%; specificity 69%) for the CM‐P biomarkers collectively and an AUC of 0.82 (sensitivity, 93%; specificity, 67%) for the SM‐S biomarkers, collectively. Conclusions and clinical importance Machine learning techniques can assist CM/SM diagnosis and facilitate understanding of abnormal morphology location with the potential to be applied to a variety of breeds and conformational diseases.
Article
Full-text available
Although decision‐making algorithms are not new to medicine, the availability of vast stores of medical data, gains in computing power, and breakthroughs in machine learning are accelerating the pace of their development, expanding the range of questions they can address, and increasing their predictive power. In many cases, however, the most powerful machine learning techniques purchase diagnostic or predictive accuracy at the expense of our ability to access “the knowledge within the machine.” Without an explanation in terms of reasons or a rationale for particular decisions in individual cases, some commentators regard ceding medical decision‐making to black box systems as contravening the profound moral responsibilities of clinicians. I argue, however, that opaque decisions are more common in medicine than critics realize. Moreover, as Aristotle noted over two millennia ago, when our knowledge of causal systems is incomplete and precarious—as it often is in medicine—the ability to explain how results are produced can be less important than the ability to produce such results and empirically verify their accuracy.
Article
Full-text available
Interpretation of increasingly complex imaging studies involves multiple intricate tasks requiring visual evaluation, cognitive processing, and decision-making. At each stage of this process, there are opportunities for error due to human factors including perceptual and ergonomic conditions. Investigation into the root causes of interpretive error in radiology first began over a century ago. In more recent work, there has been increasing recognition of the limits of human image perception and other human factors and greater acknowledgement of the role of the radiologist's environment in increasing the risk of error. This article reviews the state of research on perceptual and interpretive error in radiology. This article focuses on avenues for further error examination, and strategies for mitigating these errors are discussed. The relationship between artificial intelligence and interpretive error is also considered.
Article
Full-text available
In recent years, there has been massive progress in artificial intelligence (AI) with the development of deep neural networks, natural language processing, computer vision and robotics. These techniques are now actively being applied in healthcare with many of the health service activities currently being delivered by clinicians and administrators predicted to be taken over by AI in the coming years. However, there has also been exceptional hype about the abilities of AI with a mistaken notion that AI will replace human clinicians altogether. These perspectives are inaccurate, and if a balanced perspective of the limitations and promise of AI is taken, one can gauge which parts of the health system AI can be integrated to make a meaningful impact. The four main areas where AI would have the most influence would be: patient administration, clinical decision support, patient monitoring and healthcare interventions. This health system where AI plays a central role could be termed an AI-enabled or AI-augmented health system. In this article, we discuss how this system can be developed based on a realistic assessment of current AI technologies and predicted developments.
Article
Full-text available
Background Distinguishing between meningeal-based and intra-axial lesions by means of magnetic resonance (MR) imaging findings may occasionally be challenging. Meningiomas and gliomas account for most of the total primary brain neoplasms in dogs, and differentiating between these two forms is mandatory in choosing the correct therapy. The aims of the present study are: 1) to determine the accuracy of a deep convolutional neural network (CNN, GoogleNet) in discriminating between meningiomas and gliomas in pre- and post-contrast T1 images and T2 images; 2) to develop an image classifier, based on the combination of CNN and MRI sequence displaying the highest accuracy, to predict whether a lesion is a meningioma or a glioma. Results Eighty cases with a final diagnosis of meningioma (n = 56) and glioma (n = 24) from two different institutions were included in the study. A pre-trained CNN was retrained on our data through a process called transfer learning. To evaluate CNN accuracy in the different imaging sequences, the dataset was divided into a training, a validation and a test set. The accuracy of the CNN was calculated on the test set. The combination of post-contrast T1 images and the CNN was chosen for developing the image classifier (trCNN). Ten images from challenging cases were excluded from the database in order to test trCNN accuracy; the trCNN was trained on the remainder of the dataset of post-contrast T1 images, and correctly classified all the selected images. To compensate for the imbalance between meningiomas and gliomas in the dataset, the Matthews correlation coefficient (MCC) was also calculated. The trCNN showed an accuracy of 94% (MCC = 0.88) on post-contrast T1 images, 91% (MCC = 0.81) on pre-contrast T1-images and 90% (MCC = 0.8) on T2 images. Conclusions The developed trCNN could be a reliable tool in distinguishing between meningiomas and gliomas on MR images.
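The Matthews correlation coefficient reported above is a single-number summary of the confusion matrix that, unlike raw accuracy, is robust to the class imbalance between meningiomas (n = 56) and gliomas (n = 24). A minimal sketch of the formula (the counts below are illustrative, not the study's):

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    num = tp * tn - fp * fn
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return num / den if den else 0.0  # convention: 0 when a margin is empty

# Perfect classifier → MCC = 1.0
print(mcc(10, 5, 0, 0))
# Imbalanced illustrative example: one false positive
print(round(mcc(6, 3, 1, 0), 4))  # → 0.8018
```

MCC ranges from -1 (total disagreement) through 0 (chance level) to +1 (perfect prediction), which is why the study reports it alongside accuracy.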
Article
Optimal magnetic resonance imaging (MRI) quality and shorter scan time are challenging to achieve in veterinary practices. Recently, deep learning-based reconstruction (DLR) has been proposed for ideal image quality. We hypothesized that DLR-based MRI would improve brain imaging quality and reduce scan time. This prospective, methods comparison study compared the MR image denoising performance of DLR and conventional methods, with the aim of reducing scan time and improving canine brain image quality. Transverse T2-weighted and fluid-attenuated inversion recovery (FLAIR) sequences of the brain were performed in 12 clinically healthy beagle dogs. Different numbers of excitations (NEX) were used to obtain the image groups NEX4, NEX2, and NEX1. DLR was applied to NEX2 and NEX1 to obtain NEX2DL and NEX1DL. The scan times were recorded, and signal-to-noise ratio (SNR) and contrast-to-noise ratio (CNR) were calculated for quantitative analysis. Five blinded veterinarians assessed the overall quality, contrast, and perceived SNR on four-point Likert scales. Quantitative and qualitative values were compared among the five groups. Compared with NEX4, NEX2 and NEX1 reduced scan time by 50% and 75%, respectively. The mean SNR and CNR of NEX2DL and NEX1DL were significantly superior to those of NEX4, NEX2, and NEX1 (P < 0.05). In all image quality indices, DLR-applied images were rated significantly higher than NEX4 for both T2-weighted and FLAIR sequences, and NEX2DL had significantly better quality than NEX1DL for FLAIR (P < 0.05). Findings indicated that DLR reduced scan time and improved image quality compared with conventional MRI in a sample of clinically healthy beagles.
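SNR and CNR in studies like this one are derived from region-of-interest (ROI) statistics. The sketch below uses common definitions (mean tissue signal over the standard deviation of a background/noise ROI); the study's exact ROI placement and formula may differ, and the pixel values are illustrative:

```python
import statistics

def snr(roi, noise_roi):
    """Signal-to-noise ratio: mean ROI signal over background noise SD."""
    return statistics.mean(roi) / statistics.pstdev(noise_roi)

def cnr(roi_a, roi_b, noise_roi):
    """Contrast-to-noise ratio: absolute difference of two tissue ROI
    means over background noise SD."""
    return abs(statistics.mean(roi_a) - statistics.mean(roi_b)) / statistics.pstdev(noise_roi)

grey = [100, 102, 98, 100]       # illustrative pixel intensities
white = [80, 82, 78, 80]
background = [2, 0, 2, 0]        # population SD = 1.0
print(snr(grey, background))     # → 100.0
print(cnr(grey, white, background))  # → 20.0
```

Denoising (e.g., DLR) lowers the noise SD in the denominator, which is why both metrics rise even at reduced NEX.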
Article
Advancements in the field of artificial intelligence (AI) are modest in veterinary medicine relative to their substantial growth in human medicine. However, interest in this field is increasing, and commercially available veterinary AI products are already on the market. In this retrospective, diagnostic accuracy study, the accuracy of a commercially available convolutional neural network AI product (Vetology AI®) is assessed on 56 thoracic radiographic studies of pulmonary nodules and masses, as well as 32 control cases. Positive cases were confirmed to have pulmonary pathology consistent with a nodule/mass either by CT, cytology, or histopathology. The AI software detected pulmonary nodules/masses in 31 of 56 confirmed cases and correctly classified 30 of 32 control cases. The AI model accuracy is 69.3%, balanced accuracy 74.6%, F1‐score 0.7, sensitivity 55.4%, and specificity 93.75%. Building on these results, both the current clinical relevance of AI and how veterinarians can be expected to use available commercial products are discussed.
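The metrics reported above follow directly from the counts in the abstract: 31 of 56 confirmed nodules detected (so 25 false negatives) and 30 of 32 controls correctly classified (so 2 false positives). A quick arithmetic check:

```python
tp, fn = 31, 56 - 31   # nodules/masses detected vs. missed
tn, fp = 30, 32 - 30   # controls correctly vs. incorrectly classified

sensitivity = tp / (tp + fn)                          # 31/56 ≈ 0.554
specificity = tn / (tn + fp)                          # 30/32 = 0.9375
accuracy = (tp + tn) / (tp + tn + fp + fn)            # 61/88 ≈ 0.693
balanced_accuracy = (sensitivity + specificity) / 2   # ≈ 0.746
f1 = 2 * tp / (2 * tp + fp + fn)                      # 62/89 ≈ 0.70
print(round(accuracy, 3), round(balanced_accuracy, 3), round(f1, 2))
# → 0.693 0.746 0.7
```

All five reported values (69.3%, 74.6%, F1 0.7, 55.4%, 93.75%) are mutually consistent with those counts.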
Article
Purpose: Thoracic radiographs are commonly used to evaluate patients with confirmed or suspected thoracic pathology. Proper patient positioning is more challenging in canine and feline radiography than in humans due to less patient cooperation and body shape variation. Improper patient positioning during radiograph acquisition has the potential to lead to a misdiagnosis. Asymmetrical hemithoraces are one of the indications of obliquity, for which we propose an automatic classification method. Approach: We propose a hemithoraces segmentation method based on convolutional neural networks and active contours. We utilized the U-Net model to segment the ribs and spine and then utilized active contours to find the left and right hemithoraces. We then extracted features from the left and right hemithoraces to train an ensemble classifier comprising a support vector machine, gradient boosting, and a multi-layer perceptron. Five-fold cross-validation was used; thorax segmentation was evaluated by intersection over union (IoU), and symmetry classification was evaluated using precision, recall, area under the curve, and F1 score. Results: Classification of symmetry for 900 radiographs reported an F1 score of 82.8%. To test the robustness of the proposed thorax segmentation method to underexposure and overexposure, we synthetically corrupted properly exposed radiographs and evaluated the results using IoU. The results showed that the model's IoU for underexposure and overexposure dropped by 2.1% and 1.2%, respectively. Conclusions: Our results indicate that the proposed thorax segmentation method is robust to poorly exposed radiographs. The proposed thorax segmentation method can be applied to human radiography with minimal changes.
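Segmentation quality above is scored with intersection over union (IoU): the overlap of predicted and ground-truth masks divided by their combined area. A minimal sketch for binary masks (toy 2x2 masks, not study data):

```python
def iou(mask_a, mask_b):
    """Intersection over union for two binary masks of equal shape."""
    flat_a = [px for row in mask_a for px in row]
    flat_b = [px for row in mask_b for px in row]
    inter = sum(1 for a, b in zip(flat_a, flat_b) if a and b)
    union = sum(1 for a, b in zip(flat_a, flat_b) if a or b)
    return inter / union if union else 1.0  # two empty masks agree perfectly

pred = [[1, 1],
        [0, 0]]
truth = [[1, 0],
         [1, 0]]
print(iou(pred, truth))  # one shared pixel out of three covered ≈ 0.333
```

An IoU of 1.0 means pixel-perfect agreement, so the reported ~2% IoU drop under synthetic exposure corruption indicates only mild degradation.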
Article
Conventional MRI features of canine glioma subtypes and grades significantly overlap. Texture analysis (TA) quantifies image texture based on the spatial arrangement of pixel intensities. Machine learning (ML) models based on MRI-TA demonstrate high accuracy in predicting brain tumor types and grades in human medicine. The aim of this retrospective, diagnostic accuracy study was to investigate the accuracy of ML-based MRI-TA in predicting canine glioma histologic types and grades. Dogs with a histopathological diagnosis of intracranial glioma and available brain MRI were included. Tumors were manually segmented across their entire volume into the enhancing part, non-enhancing part, and peri-tumoral vasogenic edema in T2-weighted (T2w), T1-weighted (T1w), FLAIR, and T1w postcontrast sequences. Texture features were extracted and fed into three ML classifiers. Classifiers' performance was assessed using a leave-one-out cross-validation approach. Multiclass and binary models were built to predict histologic types (oligodendroglioma vs. astrocytoma vs. oligoastrocytoma) and grades (high vs. low), respectively. Thirty-eight dogs with a total of 40 masses were included. Machine learning classifiers had an average accuracy of 77% for discriminating tumor types and of 75.6% for predicting high-grade gliomas. The support vector machine classifier had an accuracy of up to 94% for predicting tumor types and up to 87% for predicting high-grade gliomas. The most discriminative texture features for tumor types and grades appeared related to the peri-tumoral edema in T1w images and to the non-enhancing part of the tumor in T2w images, respectively. In conclusion, ML-based MRI-TA has the potential to discriminate intracranial canine glioma types and grades.
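Leave-one-out cross-validation, used above because only 40 masses were available, trains on all cases but one and tests on the held-out case, cycling through every case. A minimal sketch with a 1-nearest-neighbour classifier standing in for the study's ML classifiers (toy 1-D "texture feature" values, not study data):

```python
def loocv_accuracy_1nn(features, labels):
    """Leave-one-out CV: each case is classified by its nearest
    neighbour among all remaining cases."""
    correct = 0
    for i, (x, y) in enumerate(zip(features, labels)):
        rest = [(abs(x - xj), yj) for j, (xj, yj) in
                enumerate(zip(features, labels)) if j != i]
        _, predicted = min(rest)          # nearest remaining case
        correct += (predicted == y)
    return correct / len(features)

# Toy feature values: two well-separated "grade" clusters
x = [0.0, 0.1, 0.2, 1.0, 1.1, 1.2]
y = [0, 0, 0, 1, 1, 1]
print(loocv_accuracy_1nn(x, y))  # → 1.0
```

LOOCV makes maximal use of a small dataset, at the cost of higher variance in the accuracy estimate, which is a sensible trade-off at n = 40.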
Article
Artificial Intelligence and machine learning are novel technologies that will change the way veterinary medicine is practiced. Exactly how this change will occur is yet to be determined, and, as is the nature with disruptive technologies, will be difficult to predict. Ushering in this new tool in a conscientious way will require knowledge of the terminology and types of AI as well as forward thinking regarding the ethical and legal implications within the profession. Developers as well as end users will need to consider the ethical and legal components alongside functional creation of algorithms in order to foster acceptance and adoption, and most importantly to prevent patient harm. There are key differences in deployment of these technologies in veterinary medicine relative to human healthcare, namely our ability to perform euthanasia, and the lack of regulatory validation to bring these technologies to market. These differences along with others create a much different landscape than AI use in human medicine, and necessitate proactive planning in order to prevent catastrophic outcomes, encourage development and adoption, and protect the profession from unnecessary liability. The authors offer that deploying these technologies prior to considering the larger ethical and legal implications and without stringent validation is putting the AI cart before the horse, and risks putting patients and the profession in harm's way.
Article
In this retrospective, analytical study, we developed a deep learning-based diagnostic model that can be applied to canine stifle joint diseases and compared its accuracy with that achieved by veterinarians to verify its potential as a reliable diagnostic method. A total of 2382 radiographs of the canine stifle joint from cooperative animal hospitals were included in a dataset. Stifle joint regions were extracted from the original images using the faster region-based convolutional neural network (R-CNN) model, and the object detection accuracy was evaluated. Four radiographic findings: patellar deviation, drawer sign, osteophyte formation, and joint effusion, were observed in the stifle joint and used to train a residual network (ResNet) classification model. Implant and growth plate groups were analyzed to compare the classification accuracy against the total dataset. All deep learning-based classification models achieved target accuracies exceeding 80%, which is comparable to or slightly less than those achieved by veterinarians. However, in the case of drawer signs, further research is necessary to improve the low sensitivity of the model. When the implant group was excluded, the classification accuracy significantly improved, indicating that the implant acted as a distraction. These results indicate that deep learning-based diagnoses can be expected to become useful diagnostic models in veterinary medicine.
Article
Splenic hemangiosarcoma has morphological similarities to benign nodular hyperplasia. Computed tomography (CT) texture analysis can quantify image texture that the naked human eye cannot detect. Recently, there have been attempts to incorporate CT texture analysis with artificial intelligence in human medicine. This retrospective, analytical design study aimed to assess the feasibility of CT texture analysis in splenic masses and investigate predictive biomarkers of splenic hemangiosarcoma in dogs. Parameters for dogs with hemangiosarcoma and nodular hyperplasia were compared, and independent parameters that could differentiate between them were selected. Discriminant analysis was performed to assess the ability to discriminate the two splenic masses and compare the relative importance of the parameters. A total of 23 dogs were sampled, including 16 with splenic nodular hyperplasia and seven with hemangiosarcoma. For each dog, a total of 38 radiomic parameters were extracted from first-, second-, and higher-order matrices. Thirteen parameters differed significantly between hemangiosarcoma and nodular hyperplasia. Skewness in the first-order matrix and GLRLM_LGRE and GLZLM_ZLNU in the second- and higher-order matrices were determined to be independent parameters. A discriminant equation consisting of skewness, GLZLM_LGZE, and GLZLM_ZLNU was derived, and cross-validation showed an accuracy of 95.7%. Skewness was the most influential parameter for discriminating the two masses. The study results supported using CT texture analysis to help differentiate hemangiosarcoma from nodular hyperplasia in dogs. This new diagnostic approach can be used for developing future machine learning-based texture analysis tools.
Article
Convolutional neural networks (CNNs) are commonly used as artificial intelligence (AI) tools for evaluating radiographs, but published studies testing their performance in veterinary patients are currently lacking. The purpose of this retrospective, secondary analysis, diagnostic accuracy study was to compare the error rates of four CNNs to the error rates of 13 veterinary radiologists for evaluating canine thoracic radiographs using an independent gold standard. Radiographs acquired at a referral institution were used to evaluate the four CNNs sharing a common architecture. Fifty radiographic studies were selected at random. The studies were evaluated independently by three board‐certified veterinary radiologists for the presence or absence of 15 thoracic labels, thus creating the gold standard through the majority rule. The labels included “cardiovascular,” “pulmonary,” “pleural,” “airway,” and “other categories.” The error rates for each of the CNNs and for 13 additional board‐certified veterinary radiologists were calculated on those same studies. There was no statistical difference in the error rates among the four CNNs for the majority of the labels. However, the CNN's training method impacted the overall error rate for three of 15 labels. The veterinary radiologists had a statistically lower error rate than all four CNNs overall and for five labels (33%). There was only one label (“esophageal dilation”) for which two CNNs were superior to the veterinary radiologists. Findings from the current study raise numerous questions that need to be addressed to further develop and standardize AI in the veterinary radiology environment and to optimize patient care.
Article
Application of artificial intelligence (AI) to improve clinical diagnosis is a burgeoning field in human and veterinary medicine. The objective of this prospective, diagnostic accuracy study was to determine the accuracy, sensitivity, and specificity of an AI-based software for diagnosing canine cardiogenic pulmonary edema from thoracic radiographs, using an American College of Veterinary Radiology-certified veterinary radiologist's interpretation as the reference standard. Five hundred consecutive canine thoracic radiographs made after-hours by a veterinary Emergency Department were retrieved. A total of 481 of 500 cases were technically analyzable. Based on the radiologist's assessment, 46 (10.4%) of these 481 dogs were diagnosed with cardiogenic pulmonary edema (CPE+). Of these cases, the AI software designated 42 of 46 as CPE+ and four of 46 as cardiogenic pulmonary edema negative (CPE-). Accuracy, sensitivity, and specificity of the AI-based software compared to radiologist diagnosis were 92.3%, 91.3%, and 92.4%, respectively (positive predictive value, 56%; negative predictive value, 99%). Findings supported using AI software screening for thoracic radiographs of dogs with suspected cardiogenic pulmonary edema to assist with short-term decision-making when a radiologist is unavailable.
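The metrics above are mutually consistent. With TP = 42 and FN = 4 given in the abstract, the reported PPV of 56% implies 33 false positives (42/75 = 0.56); the remaining figures then follow. A back-calculation (FP = 33 is inferred, not stated in the abstract):

```python
tp, fn = 42, 4          # CPE+ cases flagged / missed (from the abstract)
total, cpe_pos = 481, 46
fp = 33                 # inferred: 42 / (42 + 33) = 0.56 matches the reported PPV
tn = (total - cpe_pos) - fp

sensitivity = tp / (tp + fn)     # 42/46 ≈ 0.913
specificity = tn / (tn + fp)     # 402/435 ≈ 0.924
accuracy = (tp + tn) / total     # 444/481 ≈ 0.923
ppv = tp / (tp + fp)             # 0.56
npv = tn / (tn + fn)             # 402/406 ≈ 0.99
print(round(accuracy, 3), round(ppv, 2), round(npv, 2))
# → 0.923 0.56 0.99
```

The high NPV with modest PPV is typical of a screening tool at ~10% disease prevalence: a negative AI result is reassuring, while a positive one still warrants radiologist review.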
Article
Our title alludes to the three Christmas ghosts encountered by Ebenezer Scrooge in A Christmas Carol, who guide Ebenezer through the past, present, and future of Christmas holiday events. Similarly, our article takes readers through a journey of the past, present, and future of medical AI. In doing so, we focus on the crux of modern machine learning: the reliance on powerful but intrinsically opaque models. When applied to the healthcare domain, these models fail to meet the needs for transparency that their clinician and patient end-users require. We review the implications of this failure, and argue that opaque models (1) lack quality assurance, (2) fail to elicit trust, and (3) restrict physician-patient dialogue. We then discuss how upholding transparency in all aspects of model design and model validation can help ensure the reliability and success of medical AI.
Article
Tumor heterogeneity is a well-established marker of biologically aggressive neoplastic processes and is associated with local recurrence and distant metastasis. Quantitative analysis of CT textural features is an indirect measure of tumor heterogeneity and therefore may help predict malignant disease. The purpose of this retrospective, secondary analysis study was to quantitatively evaluate CT heterogeneity in dogs with histologically confirmed liver masses to build a predictive model for malignancy. Forty dogs with liver tumors and corresponding histopathologic evaluation from a previous prospective study were included. Triphasic image acquisition was standardized across dogs and whole liver and liver mass were contoured on each precontrast and delayed postcontrast dataset. First-order and second-order indices were extracted from contoured regions. Univariate analysis identified potentially significant indices that were subsequently used for top-down model construction. Multiple quadratic discriminatory models were constructed and tested, including individual models using both postcontrast and precontrast whole liver or liver mass volumes. The best performing model utilized the CT features voxel volume and uniformity from postcontrast mass contours; this model had an accuracy of 0.90, sensitivity of 0.67, specificity of 1.0, positive predictive value of 1.0, negative predictive value of 0.88, and precision of 1.0. Heterogeneity indices extracted from delayed postcontrast CT hepatic mass contours were more informative about tumor type compared to indices from whole liver contours, or from precontrast hepatic mass and whole liver contours. Results demonstrate that CT radiomic feature analysis may hold clinical utility as a noninvasive method of predicting hepatic malignancy and may influence diagnostic or therapeutic approaches.
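"Uniformity", one of the two CT features in the best model above, is a first-order histogram feature. A common definition is the sum of squared gray-level probabilities, so a perfectly homogeneous region scores 1.0 and heterogeneous regions score lower; the study's exact gray-level binning may differ, and the pixel values below are illustrative:

```python
from collections import Counter

def uniformity(pixels):
    """First-order uniformity (a.k.a. energy): sum of squared
    gray-level probabilities over the ROI histogram."""
    n = len(pixels)
    counts = Counter(pixels)
    return sum((c / n) ** 2 for c in counts.values())

homogeneous = [50, 50, 50, 50]      # one gray level
heterogeneous = [10, 20, 30, 40]    # four equally frequent levels
print(uniformity(homogeneous), uniformity(heterogeneous))  # → 1.0 0.25
```

Low uniformity is one quantitative proxy for the tumor heterogeneity that the abstract links to malignancy.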
Article
Cardiomegaly is the main imaging finding in canine heart disease. Deep learning has driven many advances in image-based medical diagnosis for humans, and there is increasing recognition of its potential in veterinary medicine. We report a clinically applicable assisted platform for diagnosing canine cardiomegaly with deep learning. The vertebral heart score (VHS) is a measuring method for the heart size of a dog. The VHS value is calculated from the relative positions of 16 key points detected by the system, and this result is then combined with the VHS reference ranges of dog breeds to assist in the evaluation of canine cardiomegaly. We adopted HRNet (high-resolution network) to detect the 16 key points (12 located on the vertebrae and four on the heart) in 2274 lateral X-ray images of dogs (training and validation datasets). The model was then used to detect the key points in an external testing dataset (396 images); the AP (average performance) for key point detection reached 86.4%. We then applied an additional post-processing procedure to correct the output of HRNet so that the AP reached 90.9%. These results signify that the system can effectively assist the evaluation of canine cardiomegaly in a real clinical scenario.
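The VHS expresses the heart's long- and short-axis lengths in vertebral-body units counted caudally from T4. A hedged sketch of that final arithmetic step, assuming the 16 detected key points have already been reduced to two axis lengths and a list of per-vertebra lengths (all values hypothetical):

```python
def axis_in_vertebral_units(axis_len, vertebra_lens):
    """Count how many vertebral bodies (fractional) an axis length
    spans, walking caudally from T4."""
    units, remaining = 0.0, axis_len
    for v in vertebra_lens:
        if remaining >= v:
            units += 1.0
            remaining -= v
        else:
            units += remaining / v
            return units
    return units  # axis longer than the measured vertebrae

# Hypothetical measurements in pixels
vertebrae = [10.0] * 8                 # T4 onwards, uniform for simplicity
long_axis, short_axis = 45.0, 30.0
vhs = (axis_in_vertebral_units(long_axis, vertebrae)
       + axis_in_vertebral_units(short_axis, vertebrae))
print(vhs)  # → 7.5 vertebral units, compared against breed-specific ranges
```

Automating the key point detection (as HRNet does here) removes the main source of inter-observer variability in this otherwise simple calculation.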
Article
Coccidioidomycosis is the most common systemic mycosis in dogs in the southwestern United States. With warming climates, affected areas and number of cases are expected to increase in the coming years, escalating also the chances of transmission to humans. As a result, developing methods for automating the detection of the disease is important, as this will help doctors and veterinarians more easily identify and diagnose positive cases. We apply machine learning models to provide accurate and interpretable predictions of Coccidioidomycosis. We assemble a set of radiographic images and use it to train and test state-of-the-art convolutional neural networks to detect Coccidioidomycosis. These methods are relatively inexpensive to train and very fast at inference time. We demonstrate the successful application of this approach to detect the disease with an Area Under the Curve (AUC) above 0.99 using 10-fold cross-validation. We also use the classification model to identify regions of interest and localize the disease in the radiographic images, as illustrated through visual heatmaps. This proof-of-concept study establishes the feasibility of very accurate and rapid automated detection of Valley Fever in radiographic images.
Article
Reports of machine learning implementations in veterinary imaging are infrequent, but changes in machine learning architecture and access to increased computing power will likely prompt increased interest. This diagnostic accuracy study describes a particular form of machine learning, a deep learning convolutional neural network (ConvNet), for hip joint detection and classification of hip dysplasia from ventro‐dorsal (VD) pelvis radiographs submitted for hip dysplasia screening. 11,759 pelvis images were available together with their Fédération Cynologique Internationale (FCI) scores. The dataset was dichotomized into images showing no signs of hip dysplasia (FCI grades “A” and “B”, the “A‐B” group) and images showing signs of dysplasia (FCI grades “C”, “D,” and “E”, the “C‐E” group). In a transfer learning approach, an existing pretrained ConvNet was fine‐tuned to provide models to recognize hip joints in VD pelvis images and to classify them according to their FCI score grouping. The results yielded two models. The first was successful in detecting hip joints in the VD pelvis images (intersection over union of 85%). The second yielded a sensitivity of 0.53, a specificity of 0.92, a positive predictive value of 0.91, and a negative predictive value of 0.81 for the classification of detected hip joints as being in the “C‐E” group. ConvNets and transfer learning are applicable to veterinary imaging. The models obtained have the potential to be a tool to aid in hip screening protocols if hip dysplasia classification performance were improved through access to more data and possibly through model optimization.
Article
Magnetic resonance imaging is the primary method used to diagnose canine glial cell neoplasia and noninfectious inflammatory meningoencephalitis. Subjective differentiation of these diseases can be difficult due to overlapping imaging characteristics. This study utilizes texture analysis (TA) of intra‐axial lesions both as a means to quantitatively differentiate these broad categories of disease and to help identify glial tumor grade/cell type and specific meningoencephalitis subtype in a group of 119 dogs with histologically confirmed diagnoses. Fifty‐nine dogs with gliomas and 60 dogs with noninfectious inflammatory meningoencephalitis were retrospectively recruited and randomly split into training (n = 80) and test (n = 39) cohorts. Forty‐five of 120 texture metrics differed significantly between cohorts after correcting for multiple testing (false discovery rate < 0.05). After training the random forest algorithm, the classification accuracy for the test set was 85% (sensitivity 89%, specificity 81%). TA was only partially able to differentiate the inflammatory subtypes (granulomatous meningoencephalitis [GME], necrotizing meningoencephalitis [NME], and necrotizing leukoencephalitis [NLE]) (out‐of‐bag error rate of 35.0%) and was unable to identify metrics that could correctly classify glioma grade or cell type (out‐of‐bag error rates of 59.6% and 47.5%, respectively). Multiple demographic differences, such as patient age, sex, weight, and breed, were identified between disease cohorts and subtypes, which may be useful in prioritizing differential diagnoses. TA of MR images with a random forest algorithm provided classification accuracy for inflammatory and neoplastic brain disease approaching the accuracy of previously reported subjective radiologist evaluation.
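Screening 120 texture metrics at a false discovery rate < 0.05, as above, implies a multiple-testing correction such as the Benjamini-Hochberg step-up procedure (the abstract does not name the specific method, so BH is an assumption here). A minimal sketch with illustrative p-values:

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return indices of hypotheses rejected at FDR level q
    (Benjamini-Hochberg step-up procedure)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * q:
            k_max = rank              # largest rank passing its threshold
    return sorted(order[:k_max])      # reject the k_max smallest p-values

print(benjamini_hochberg([0.01, 0.02, 0.03, 0.04]))  # all four rejected
print(benjamini_hochberg([0.04, 0.50, 0.20, 0.90]))  # none rejected
```

Unlike a Bonferroni correction, BH controls the expected fraction of false discoveries among the rejected metrics rather than the chance of any single false positive, which preserves power when screening many correlated texture features.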
Article
To date, deep learning technologies have provided powerful decision support systems to radiologists in human medicine. The aims of this retrospective, exploratory study were to develop and describe an artificial intelligence able to screen thoracic radiographs for primary thoracic lesions in feline and canine patients. Three deep learning networks using three different pretraining strategies to predict 15 types of primary thoracic lesions were created (including tracheal collapse, left atrial enlargement, alveolar pattern, pneumothorax, and pulmonary mass). Upon completion of pretraining, the algorithms were provided with over 22 000 thoracic veterinary radiographs for specific training. All radiographs had a report created by a board‐certified veterinary radiologist used as the gold standard. The performances of all three networks were compared to one another. An additional 120 radiographs were then evaluated by three types of observers: the best performing network, veterinarians, and veterinarians aided by the network. The error rate for each observer was calculated overall and for each of the 15 labels, and rates were compared using McNemar's test. The overall error rate of the network was significantly better than the overall error rate of the veterinarians or the veterinarians aided by the network (10.7% vs 16.8% vs 17.2%, P = .001). The network's error rate was also significantly better for detecting cardiac enlargement and bronchial pattern. The current network only provides help in detecting various lesion types and does not provide a diagnosis. Based on its overall very good performance, this could be used as an aid to general practitioners while waiting for the radiologist's report.
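McNemar's test, used above to compare paired observers on the same 120 radiographs, depends only on the discordant pairs: cases one observer got right and the other got wrong. A continuity-corrected sketch using only the standard library (the counts are illustrative, not the study's):

```python
import math

def mcnemar(b, c):
    """Continuity-corrected McNemar chi-square and its p-value
    (chi-square with 1 df) from the two discordant-pair counts."""
    chi2 = (abs(b - c) - 1) ** 2 / (b + c)
    # survival function of chi-square(1 df): P = erfc(sqrt(x / 2))
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# b: cases only observer A erred on; c: cases only observer B erred on
chi2, p = mcnemar(b=10, c=2)
print(round(chi2, 3), round(p, 3))  # → 4.083 0.043
```

Because both observers read the same cases, this paired test is more appropriate than comparing the two error rates as if they came from independent samples.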
Article
This article presents a mapping review of the literature concerning the ethics of artificial intelligence (AI) in health care. The goal of this review is to summarise current debates and identify open questions for future research. Five literature databases were searched to support the following research question: how can the primary ethical risks presented by AI-health be categorised, and what issues must policymakers, regulators and developers consider in order to be 'ethically mindful'? A series of screening stages was carried out—for example, removing articles that focused on digital health in general (e.g. data sharing, data access, data privacy, surveillance/nudging, consent, ownership of health data, evidence of efficacy)—yielding a total of 156 papers that were included in the review. We find that ethical issues can be (a) epistemic, related to misguided, inconclusive or inscrutable evidence; (b) normative, related to unfair outcomes and transformative effects; or (c) related to traceability. We further find that these ethical issues arise at six levels of abstraction: individual, interpersonal, group, institutional, sectoral, and societal. Finally, we outline a number of considerations for policymakers and regulators, mapping these to the existing literature, and categorising each as epistemic, normative or traceability-related and at the relevant level of abstraction. Our goal is to inform policymakers, regulators and developers of what they must consider if they are to enable health and care systems to capitalise on the dual advantage of ethical AI: maximising the opportunities to cut costs, improve care, and improve the efficiency of health and care systems, whilst proactively avoiding the potential harms. We argue that if action is not swiftly taken in this regard, a new 'AI winter' could occur due to chilling effects related to a loss of public trust in the benefits of AI for health care.
Article
The use of artificial intelligence, and the deep-learning subtype in particular, has been enabled by the use of labeled big data, along with markedly enhanced computing power and cloud storage, across all sectors. In medicine, this is beginning to have an impact at three levels: for clinicians, predominantly via rapid, accurate image interpretation; for health systems, by improving workflow and the potential for reducing medical errors; and for patients, by enabling them to process their own data to promote health. The current limitations, including bias, privacy and security, and lack of transparency, along with the future directions of these applications will be discussed in this article. Over time, marked improvements in accuracy, productivity, and workflow will likely be actualized, but whether that will be used to improve the patient–doctor relationship or facilitate its erosion remains to be seen.