To ensure the reliable use of classification systems in medical applications, it is crucial to prevent silent failures. This can be achieved by either designing classifiers that are robust enough to avoid failures in the first place, or by detecting remaining failures using confidence scoring functions (CSFs). A predominant source of failures in image classification is distribution shifts between training data and deployment data. To understand the current state of silent failure prevention in medical imaging, we conduct the first comprehensive analysis comparing various CSFs in four biomedical tasks and a diverse range of distribution shifts. Based on the result that none of the benchmarked CSFs can reliably prevent silent failures, we conclude that a deeper understanding of the root causes of failures in the data is required. To facilitate this, we introduce SF-Visuals, an interactive analysis tool that uses latent space clustering to visualize shifts and failures. On the basis of various examples, we demonstrate how this tool can help researchers gain insight into the requirements for safe application of classification systems in the medical domain. The open-source benchmark and tool are at: https://github.com/IML-DKFZ/sf-visuals.
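As an illustration of what a confidence scoring function can look like, the sketch below implements the maximum softmax response, a common baseline. The abstract does not name the CSFs that were benchmarked, so this choice, the function names, and the 0.8 threshold are all assumptions made for illustration only.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def max_softmax_confidence(logits):
    """Confidence score: the probability assigned to the predicted class."""
    return max(softmax(logits))


def flag_for_review(logits, threshold=0.8):
    """Return True when confidence is below the (hypothetical) threshold."""
    return max_softmax_confidence(logits) < threshold
```

A prediction flagged this way would be deferred to a human reader rather than silently accepted; the benchmark's finding is that no such score reliably catches all failures under distribution shift.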
There has been exploding interest in embracing Transformer-based architectures for medical image segmentation. However, the lack of large-scale annotated medical datasets makes achieving performance equivalent to that on natural images challenging. Convolutional networks, in contrast, have higher inductive biases and, consequently, are easily trainable to high performance. Recently, the ConvNeXt architecture attempted to modernize the standard ConvNet by mirroring Transformer blocks. In this work, we improve upon this to design a modernized and scalable convolutional architecture customized to the challenges of data-scarce medical settings. We introduce MedNeXt, a Transformer-inspired large kernel segmentation network which introduces: 1) A fully ConvNeXt 3D Encoder-Decoder Network for medical image segmentation, 2) Residual ConvNeXt up and downsampling blocks to preserve semantic richness across scales, 3) A novel technique to iteratively increase kernel sizes by upsampling small kernel networks, to prevent performance saturation on limited medical data, 4) Compound scaling at multiple levels (depth, width, kernel size) of MedNeXt. This leads to state-of-the-art performance on 4 tasks on CT and MRI modalities and varying dataset sizes, representing a modernized deep architecture for medical image segmentation. Our code is made publicly available at: https://github.com/MIC-DKFZ/MedNeXt.
Data augmentation (DA) is a key factor in medical image analysis, such as in prostate cancer (PCa) detection on magnetic resonance images. State-of-the-art computer-aided diagnosis systems still rely on simplistic spatial transformations to preserve the pathological label post transformation. However, such augmentations do not substantially increase the organ as well as tumor shape variability in the training set, limiting the model’s ability to generalize to unseen cases with more diverse localized soft-tissue deformations. We propose a new anatomy-informed transformation that leverages information from adjacent organs to simulate typical physiological deformations of the prostate and generates unique lesion shapes without altering their label. Due to its lightweight computational requirements, it can be easily integrated into common DA frameworks. We demonstrate the effectiveness of our augmentation on a dataset of 774 biopsy-confirmed examinations, by evaluating a state-of-the-art method for PCa detection with different augmentation settings.
Domain gaps are among the most relevant roadblocks in the clinical translation of machine learning (ML)-based solutions for medical image analysis. While current research focuses on new training paradigms and network architectures, little attention is given to the specific effect of prevalence shifts on an algorithm deployed in practice. Such discrepancies between class frequencies in the data used for a method’s development/validation and that in its deployment environment(s) are of great importance, for example in the context of artificial intelligence (AI) democratization, as disease prevalences may vary widely across time and location. Our contribution is twofold. First, we empirically demonstrate the potentially severe consequences of missing prevalence handling by analyzing (i) the extent of miscalibration, (ii) the deviation of the decision threshold from the optimum, and (iii) the ability of validation metrics to reflect neural network performance on the deployment population as a function of the discrepancy between development and deployment prevalence. Second, we propose a workflow for prevalence-aware image classification that uses estimated deployment prevalences to adjust a trained classifier to a new environment, without requiring additional annotated deployment data. Comprehensive experiments based on a diverse set of 30 medical classification tasks showcase the benefit of the proposed workflow in generating better classifier decisions and more reliable performance estimates compared to current practice.
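One standard way to adapt predicted class probabilities to a new prevalence is Bayes-rule re-weighting under a prior shift. The sketch below shows that generic technique; it is not necessarily the workflow proposed in the paper, and the function and parameter names are hypothetical.

```python
def adjust_for_prevalence(probs, dev_prevalence, deploy_prevalence):
    """Re-weight class probabilities for a change in class prevalence.

    probs: predicted class probabilities for one sample
    dev_prevalence: class frequencies in the development data
    deploy_prevalence: estimated class frequencies at deployment
    Assumes only the class priors change, not the class-conditional
    image distribution (a prior-shift assumption).
    """
    weighted = [
        p * (new / old)
        for p, old, new in zip(probs, dev_prevalence, deploy_prevalence)
    ]
    total = sum(weighted)
    return [w / total for w in weighted]
```

For example, a prediction of 0.5 for the disease class made by a model developed on balanced data would drop to 0.1 if the disease occurs in only 10% of cases in the deployment environment.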
Synthetic medical image generation has evolved as a key technique for neural network training and validation. A core challenge, however, remains in the domain gap between simulations and real data. While deep learning-based domain transfer using Cycle Generative Adversarial Networks and similar architectures has led to substantial progress in the field, there are use cases in which state-of-the-art approaches still fail to generate training images that produce convincing results on relevant downstream tasks. Here, we address this issue with a domain transfer approach based on conditional invertible neural networks (cINNs). As a particular advantage, our method inherently guarantees cycle consistency through its invertible architecture, and network training can efficiently be conducted with maximum likelihood training. To showcase our method’s generic applicability, we apply it to two spectral imaging modalities at different scales, namely hyperspectral imaging (pixel-level) and photoacoustic tomography (image-level). According to comprehensive experiments, our method enables the generation of realistic spectral data and outperforms the state of the art on two downstream classification tasks (binary and multi-class). cINN-based domain transfer could thus evolve as an important method for realistic synthetic data generation in the field of spectral imaging and beyond. The code is available at https://github.com/IMSY-DKFZ/UDT-cINN.
Classification of heterogeneous diseases is challenging due to their complexity, variability of symptoms and imaging findings. Chronic Obstructive Pulmonary Disease (COPD) is a prime example, being underdiagnosed despite being the third leading cause of death. Its sparse, diffuse and heterogeneous appearance on computed tomography challenges supervised binary classification. We reformulate COPD binary classification as an anomaly detection task, proposing cOOpD: heterogeneous pathological regions are detected as Out-of-Distribution (OOD) from normal homogeneous lung regions. To this end, we learn representations of unlabeled lung regions employing a self-supervised contrastive pretext model, potentially capturing specific characteristics of diseased and healthy unlabeled regions. A generative model then learns the distribution of healthy representations and identifies abnormalities (stemming from COPD) as deviations. Patient-level scores are obtained by aggregating region OOD scores. We show that cOOpD achieves the best performance on two public datasets, with an increase of 8.2% and 7.7% in terms of AUROC compared to the previous supervised state-of-the-art. Additionally, cOOpD yields well-interpretable spatial anomaly maps and patient-level scores which we show to be of additional value in identifying individuals in the early stage of progression. Experiments in artificially designed real-world prevalence settings further support that anomaly detection is a powerful way of tackling COPD classification. Code is at https://github.com/MIC-DKFZ/cOOpD.
Surgical scene understanding is a key prerequisite for context-aware decision support in the operating room. While deep learning-based approaches have already reached or even surpassed human performance in various fields, the task of surgical action recognition remains a major challenge. With this contribution, we are the first to investigate the concept of self-distillation as a means of addressing class imbalance and potential label ambiguity in surgical video analysis. Our proposed method is a heterogeneous ensemble of three models that use Swin Transformers as backbone and the concepts of self-distillation and multi-task learning as core design choices. According to ablation studies performed with the CholecT45 challenge data via cross-validation, the biggest performance boost is achieved by the usage of soft labels obtained by self-distillation. External validation of our method on an independent test set was achieved by providing a Docker container of our inference model to the challenge organizers. According to their analysis, our method outperforms all other solutions submitted to the latest challenge in the field. Our approach thus shows the potential of self-distillation for becoming an important tool in medical image analysis applications. Code available at https://github.com/IMSY-DKFZ/self-distilled-swin.
Introduction Cancer-related fatigue (CRF) is a frequent and burdensome sequela of cancer and cancer therapies. It can persist from months to years and has a substantial impact on patients’ quality of life and functioning. CRF is often still not adequately diagnosed and insufficiently treated. According to guideline recommendations, patients should be routinely screened for CRF from cancer diagnosis onwards. We will investigate how an effective screening should be designed regarding timing, frequency, screening type and cut-off points. Methods and analysis MERLIN is a longitudinal observational study that will include 300 patients with cancer at the beginning of cancer therapy. The main study centre is the National Center for Tumour Diseases Heidelberg, Germany. Patients answer five short screening items for CRF at high frequency during their therapy and at lower frequency during the post-treatment phase for 18 months. Further, CRF is assessed at wider intervals based on the Cella criteria, the Brief Fatigue Inventory impact scale, the quality of life fatigue questionnaire (QLQ-FA12) and the fatigue and cognitive items of the quality of life core questionnaire (QLQ-C30), both of the European Organisation for Research and Treatment of Cancer. Important psychological, socio-demographic or medical factors that may exacerbate CRF are assessed. All assessments are performed online. Receiver operating characteristic curves, areas under the curve, sensitivity, specificity, positive and negative predictive values and likelihood ratios will be calculated to determine optimal short screening modalities. Ethics and dissemination The study was approved by the ethics committee of the Medical Faculty of Heidelberg University, Germany (approval number: S-336/2022). Written informed consent is obtained from all participants. The study is conducted in full conformance with the principles of the Declaration of Helsinki.
Results will be published in peer-reviewed scientific journals, presented at conferences and communicated to clinical stakeholders to foster the implementation of effective CRF management. Trial registration number ClinicalTrials.gov; registration number: NCT05448573.
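The diagnostic accuracy measures named in the analysis plan follow directly from a 2×2 table of screening result against reference diagnosis. The sketch below computes them for a single cut-off; the function name is hypothetical and any counts passed to it are illustrative, not study data.

```python
def screening_metrics(tp, fp, fn, tn):
    """Diagnostic accuracy measures from a 2x2 table at one cut-off.

    tp/fn: patients with CRF who screen positive/negative
    fp/tn: patients without CRF who screen positive/negative
    Assumes every row and column of the table contains at least one patient.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)  # positive predictive value
    npv = tn / (tn + fn)  # negative predictive value
    # Likelihood ratios are undefined at perfect specificity or zero
    # specificity; report infinity in those edge cases.
    lr_pos = sensitivity / (1 - specificity) if specificity < 1 else float("inf")
    lr_neg = (1 - sensitivity) / specificity if specificity > 0 else float("inf")
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
    }
```

Repeating the calculation over a range of cut-off points yields the sensitivity and specificity pairs that make up a receiver operating characteristic curve.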
Background: Extracellular vesicles (EVs) and non-coding RNAs (ncRNAs) are emerging contributors to Alzheimer’s disease (AD) pathophysiology. Differential abundance of ncRNAs carried by EVs may provide valuable insights into underlying disease mechanisms. Brain tissue-derived EVs (bdEVs) are particularly relevant, as they may offer valuable insights about the tissue of origin. However, there is limited research on diverse ncRNA species in bdEVs in AD. Objective: This study explored whether the non-coding RNA composition of EVs isolated from post-mortem brain tissue is related to AD pathogenesis. Methods: bdEVs from age-matched late-stage AD patients (n = 23) and controls (n = 10) that had been separated and characterized in our previous study were used for RNA extraction, small RNA sequencing, and qPCR verification. Results: Significant differences of non-coding RNAs between AD and controls were found, especially for miRNAs and tRNAs. AD pathology-related miRNA and tRNA differences of bdEVs partially matched expression differences in source brain tissues. AD pathology had a more prominent association than biological sex with bdEV miRNA and tRNA components in late-stage AD brains. Conclusions: Our study provides further evidence that EV non-coding RNAs from human brain tissue, including but not limited to miRNAs, may be altered and contribute to AD pathogenesis.
High, sustainable MRD negativity rates with Isa-KRd in newly diagnosed high-risk multiple myeloma (CONCEPT trial)
Background Growing challenges in oncology require evolving educational methods and content. International efforts to reform oncology education are underway. Hands-on, interdisciplinary, and compact course formats have shown great effectiveness in the education of medical students. Our aim was to establish a new interdisciplinary one-week course on the principles of oncology using state-of-the-art teaching methods. Methods In an initial survey, medical students of LMU Munich were questioned about their current level of knowledge on the principles of oncology. In a second two-stage survey, the increase in knowledge resulting from our recently established interdisciplinary one-week course was determined. Results The medical students’ knowledge of clinically important oncological topics, such as the diagnostic workup and interdisciplinary treatment options, showed a need for improvement. Knowledge of the major oncological entities was also in an expandable state. By attending the one-week course on the principles of oncology, students improved their expertise in all areas of the clinical workup in oncology and had the opportunity to close previous knowledge gaps. In addition, students were able to gain more in-depth clinical knowledge on the most common oncological entities. Conclusion The interdisciplinary one-week course on the principles of oncology proved to be an effective teaching method to expand the knowledge of the future physicians to an appropriate level. With its innovative and interdisciplinary approach, the one-week course could be used as a showcase project for the ongoing development of medical education in Germany.
Virus-host interactions can reveal potentially effective and selective therapeutic targets for treating infection. Here, we performed an integrated analysis of the dynamics of virus replication and the host cell transcriptional response to severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection using human Caco-2 colon cancer cells as a model. Time-resolved RNA sequencing revealed that, upon infection, cells immediately transcriptionally activated genes associated with inflammatory pathways that mediate the antiviral response, which was followed by an increase in the expression of genes involved in ribosome and mitochondria function, thus suggesting rapid alterations in protein production and cellular energy supply. At later stages, between 24 and 48 hours after infection, the expression of genes involved in metabolic processes—in particular, those related to xenobiotic metabolism—was decreased. Mathematical modeling incorporating SARS-CoV-2 replication suggested that SARS-CoV-2 proteins inhibited the host antiviral response and that virus transcripts exceeded the translation capacity of the host cells. Targeting kinase-dependent pathways that exhibited increases in transcription in host cells was as effective as a virus-targeted inhibitor at repressing viral replication. Our findings in this model system delineate a sequence of SARS-CoV-2 virus-host interactions that may facilitate the identification of druggable host pathways to suppress infection.
Aim The optimal management of early recurrent prostate cancer (PC) following radical prostatectomy (RP) in patients with a negative prostate-specific membrane antigen positron-emission tomography (PSMA-PET) scan is an ongoing subject of debate. The aim of this study was to evaluate the outcome of salvage radiotherapy (SRT) in patients with biochemical recurrence and negative PSMA-PET findings. Methods This retrospective, multicenter (11 centers, 5 countries) analysis included patients who underwent SRT following biochemical recurrence (BR) of PC after RP without evidence of disease on PSMA-PET staging. Biochemical recurrence-free survival (bRFS), metastasis-free survival (MFS) and overall survival (OS) were assessed using the Kaplan-Meier method. Multivariable Cox proportional hazards regression assessed predefined predictors of survival outcomes. Results Three hundred patients were included; 253 (84.3%) received SRT to the prostate bed only and 46 (15.3%) received additional elective pelvic nodal irradiation. Only 41 patients (13.7%) received concomitant androgen deprivation therapy (ADT). Median follow-up after SRT was 33 months (IQR: 20–46 months). Three-year bRFS, MFS, and OS following SRT were 73.9%, 87.8%, and 99.1%, respectively. Three-year bRFS was 77.5% and 48.3% for patients with PSA levels before PSMA-PET ≤ 0.5 ng/ml and > 0.5 ng/ml, respectively. In univariate analysis, International Society of Urological Pathology (ISUP) grade > 2 (p = 0.006), metastatic pelvic lymph nodes at surgery (p = 0.032), seminal vesicle involvement (p < 0.001), pre-SRT PSA level > 0.5 ng/ml (p = 0.004), and lack of concomitant ADT (p = 0.023) were significantly associated with worse bRFS. In multivariable Cox proportional hazards regression, seminal vesicle infiltration (p = 0.007), ISUP grade > 2 (p = 0.048), and pre-SRT PSA level > 0.5 ng/ml (p = 0.013) remained significantly associated with worse bRFS.
Conclusion Favorable bRFS after SRT in patients with BR and negative PSMA-PET following RP was achieved. These data support the usage of early SRT for patients with negative PSMA-PET findings.
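The Kaplan-Meier method used for the survival estimates above can be sketched in a few lines. This is a minimal product-limit estimator written for illustration, with a hypothetical function name; it is not the authors' analysis code, and a published analysis would normally rely on an established statistics package.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times: follow-up duration per patient
    events: 1 if the event (e.g. recurrence) was observed, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            leaving += 1
            i += 1
        if observed > 0:
            survival *= 1 - observed / n_at_risk
            curve.append((t, survival))
        # Censored patients leave the risk set without changing survival.
        n_at_risk -= leaving
    return curve
```

Reading the curve at 36 months gives a three-year survival estimate of the kind reported in the abstract; censored patients contribute to the risk set only up to their last follow-up.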
Diffuse gliomas in adults encompass a heterogeneous group of central nervous system neoplasms. In recent years, extensive (epi-)genomic profiling has identified several glioma subgroups characterized by distinct molecular characteristics, most importantly IDH1/2 and histone H3 mutations. A group of 16 diffuse gliomas classified as “adult-type diffuse high-grade glioma, IDH-wildtype, subtype F (HGG-F)” was identified by the DKFZ v12.5 Brain Tumor Classifier. Histopathologic characterization, exome sequencing, and review of clinical data were performed in all cases. Based on unsupervised t-distributed stochastic neighbor embedding and clustering analysis of genome-wide DNA methylation data, HGG-F shows distinct epigenetic profiles separate from established central nervous system tumors. Exome sequencing demonstrated frequent TERT promoter (12/15 cases), PIK3R1 (11/16), and TP53 mutations (5/16). Radiologic characteristics were reminiscent of gliomatosis cerebri in 9/14 cases (64%). Histopathologically, most cases were classified as diffuse gliomas (7/16, 44%) or were suspicious for the infiltration zone of a diffuse glioma (5/16, 31%). None of the cases demonstrated microvascular proliferation or necrosis. Outcome of the 14 patients with follow-up data was better than that of IDH-wildtype glioblastomas, with a median progression-free survival of 58 months and overall survival of 74 months (both P < 0.0001). Our series represents a novel type of adult-type diffuse glioma with distinct molecular and clinical features. Importantly, we provide evidence that TERT promoter mutations in diffuse gliomas without further morphologic or molecular signs of high-grade glioma should be interpreted in the context of the clinicoradiologic presentation as well as epigenetic profile and may not be suitable as a standalone marker for glioblastoma, IDH-wildtype.
Objective First implementation of dynamic oxygen-17 (¹⁷O) MRI at 7 Tesla (T) during neuronal stimulation in the human brain. Methods Five healthy volunteers underwent a three-phase ¹⁷O gas (¹⁷O₂) inhalation experiment. Combined right-side visual stimulus and right-hand finger tapping were used to achieve neuronal stimulation in the left cerebral hemisphere. Data analysis included the evaluation of the relative partial volume (PV)-corrected time evolution of absolute ¹⁷O water (H₂¹⁷O) concentration and of the relative signal evolution without PV correction. Statistical analysis was performed using a one-tailed paired t test. Blood oxygen level-dependent (BOLD) experiments were performed to validate the stimulation paradigm. Results The BOLD maps showed significant activity in the stimulated left visual and sensorimotor cortex compared to the non-stimulated right side. PV correction of ¹⁷O MR data resulted in high signal fluctuations with a noise level of 10% due to small regions of interest (ROI), impeding further quantitative analysis. Statistical evaluation of the relative H₂¹⁷O signal with PV correction (p = 0.168) and without (p = 0.382) did not show a significant difference between the stimulated left and non-stimulated right sensorimotor ROI. Discussion The change of cerebral oxygen metabolism induced by sensorimotor and visual stimulation is not large enough to be reliably detected with the current setup and methodology of dynamic ¹⁷O MRI at 7 T.
Objectives The subendocardial viability ratio (SEVR) reflects the balance of myocardial oxygen supply and demand. Low SEVR indicates reduced subendocardial perfusion and has been shown to predict mortality in patients with kidney disease and diabetes. The aim of this study is to investigate the association of SEVR and mortality in the elderly population. Methods We analysed data from the CARdiovascular disease, Living and Ageing in Halle (CARLA) study. SEVR was estimated noninvasively by radial artery tonometry and brachial blood pressure measurement. The study population was divided into a low (SEVR ≤ 130%) and a normal (SEVR > 130%) SEVR group. Cox regression was used for survival analysis. Results In total, 1414 participants (635 women, 779 men) aged from 50 to 87 years (mean age 67.3 years) were included in the analysis. The all-cause mortality was 22.7% during a median follow-up of 10.5 years. The unadjusted association of SEVR with all-cause mortality decreased from 3.52 (1.31–9.46) [hazard ratio (95% confidence interval) for low SEVR ≤ 130% versus normal SEVR > 130%] among those younger than 60 years to 0.86 (0.50–1.48) among those older than 80 years and from 1.81 (0.22–14.70) to 0.75 (0.30–1.91) for cardiovascular mortality. Sex-specific unadjusted analyses demonstrated an association of SEVR with all-cause and cardiovascular mortality in men [2.32 (1.61–3.34) and 2.24 (1.18–4.24)], but not in women [1.53 (0.87–2.72) and 1.14 (0.34–3.82)]. Conclusion Our data suggest that SEVR is an age-dependent predictor for all-cause mortality, predominantly in men younger than 60 years.
Background Small bowel malperfusion (SBM) can cause high morbidity and severe surgical consequences. However, there is no standardized objective measuring tool for the quantification of SBM. Indocyanine green (ICG) imaging can be used for visualization, but lacks standardization and objectivity. Hyperspectral imaging (HSI) as a newly emerging technology in medicine might present advantages over conventional ICG fluorescence or in combination with it. Methods HSI baseline data from physiological small bowel, avascular small bowel and small bowel after intravenous application of ICG were recorded in a total of 54 in-vivo pig models. Visualizations of avascular small bowel after mesotomy were compared between HSI only (1), ICG-augmented HSI (IA-HSI) (2), clinical evaluation through the eyes of the surgeon (3) and conventional ICG imaging (4). The primary research focus was the localization of resection borders as suggested by each of the 4 methods. Distances between these borders were measured and histological samples were obtained from the regions in between in order to quantify necrotic changes 6 hours after mesotomy for every region. Results StO₂ images (1) were capable of visualizing areas of physiological perfusion and areas of clearly impaired perfusion. However, exact borders where physiological perfusion started to decrease could not be clearly identified. Instead, IA-HSI (2) suggested a sharp resection line where StO₂ values started to decrease. Clinical evaluation (3) suggested a resection line 23 mm (±7 mm) and conventional ICG imaging (4) even suggested a resection line 53 mm (±13 mm) closer towards the malperfused region. Histopathological evaluation of the region that was sufficiently perfused only according to conventional ICG (R3) already revealed a significant increase in pre-necrotic changes in 27% (±9%) of surface area. Therefore, conventional ICG seems less sensitive than IA-HSI with regard to detection of insufficient tissue perfusion.
Conclusions In this experimental animal study, IA-HSI (2) was superior for the visualization of segmental SBM compared to conventional HSI (1), clinical evaluation (3) or conventional ICG imaging (4) regarding histopathological safety. ICG application caused visual artifacts in the StO₂ values of the HSI camera, with values increasing significantly. This is caused by optical properties of systemic ICG and does not reflect a true increase in oxygenation levels. However, this empirical finding can be used to visualize segmental SBM utilizing ICG as contrast agent in an approach for IA-HSI. Clinical applicability and relevance will have to be explored in clinical trials. Level of Evidence Not applicable. Translational animal science. Original article.
A large number of variants identified through clinical genetic testing in disease susceptibility genes are of uncertain significance (VUS). Following the recommendations of the American College of Medical Genetics and Genomics (ACMG) and Association for Molecular Pathology (AMP), the frequency in case-control datasets (PS4 criterion) can inform their interpretation. We present a novel case-control likelihood ratio-based method that incorporates gene-specific age-related penetrance. We demonstrate the utility of this method in the analysis of simulated and real datasets. In the analysis of simulated data, the likelihood ratio method was more powerful compared to other methods. Likelihood ratios were calculated for a case-control dataset of BRCA1 and BRCA2 variants from the Breast Cancer Association Consortium (BCAC) and compared with logistic regression results. A larger number of variants reached evidence in favor of pathogenicity, and a substantial number of variants had evidence against pathogenicity—findings that would not have been reached using other case-control analysis methods. Our novel method provides greater power to classify rare variants compared with classical case-control methods. As an initiative from the ENIGMA Analytical Working Group, we provide user-friendly scripts and preformatted Excel calculators for implementation of the method for rare variants in BRCA1, BRCA2, and other high-risk genes with known penetrance.
Objective Pancreatic ductal adenocarcinoma (PDAC) is a lethal malignancy. Differentiation from chronic pancreatitis (CP) is currently inaccurate in about one-third of cases. Misdiagnoses in both directions, however, have severe consequences for patients. We set out to identify molecular markers for a clear distinction between PDAC and CP. Design Genome-wide variations of DNA-methylation, messenger RNA and microRNA level as well as combinations thereof were analysed in 345 tissue samples for marker identification. To improve diagnostic performance, we established a random-forest machine-learning approach. Results were validated on another 48 samples and further corroborated in 16 liquid biopsy samples. Results Machine-learning succeeded in defining markers to differentiate between patients with PDAC and CP, while low-dimensional embedding and cluster analysis failed to do so. DNA-methylation yielded the best diagnostic accuracy by far, dwarfing the importance of transcript levels. Identified changes were confirmed with data taken from public repositories and validated in independent sample sets. A signature of six DNA-methylation sites in a CpG-island of the protein kinase C beta type gene achieved a validated diagnostic accuracy of 100% in tissue and in circulating free DNA isolated from patient plasma. Conclusion The success of machine-learning in identifying an effective marker signature documents the power of this approach. The high diagnostic accuracy of discriminating PDAC from CP could have tremendous consequences for treatment success, once the result, so far based on a limited number of liquid biopsy samples, is confirmed in a larger cohort of patients with suspected pancreatic cancer.