Ben Glocker
Imperial College London | Imperial · Department of Computing

PhD

302 Publications
74,551 Reads
17,467 Citations
Introduction
I am a Lecturer in Medical Image Computing at Imperial College London and a member of the Biomedical Image Analysis Group in the section of Visual Information Processing at the Department of Computing. My research is in the area of biomedical image analysis and computer vision with a focus on semantic understanding of images using machine learning.
Publications (302)
Preprint
Full-text available
The field of automatic biomedical image analysis crucially depends on robust and meaningful performance metrics for algorithm validation. Current metric usage, however, is often ill-informed and does not reflect the underlying domain interest. Here, we present a comprehensive framework that guides researchers towards choosing performance metrics in...
Article
Computed tomography (CT) brain imaging is routinely used to support clinical decision-making in patients with traumatic brain injury (TBI). Only 7% of scans, however, demonstrate evidence of TBI. The other 93% of scans contribute a significant cost to the healthcare system and a radiation risk to patients. There may be better strategies to identify...
Preprint
Failure detection in automated image classification is a critical safeguard for clinical deployment. Detected failure cases can be referred to human assessment, ensuring patient safety in computer-aided clinical decision making. Despite its paramount importance, there is insufficient evidence about the ability of state-of-the-art confidence scoring...
Preprint
Variational autoencoders (VAEs) are a popular class of deep generative models with many variants and a wide range of applications. Improvements upon the standard VAE mostly focus on the modelling of the posterior distribution over the latent space and the properties of the neural network decoder. In contrast, improving the model for the observation...
Article
Full-text available
Complex metabolic disruption is a crucial aspect of the pathophysiology of traumatic brain injury (TBI). Associations between this and systemic metabolism and their potential prognostic value are poorly understood. Here, we aimed to describe the serum metabolome (including lipidome) associated with acute TBI within 24 h post-injury, and its relatio...
Article
Background Despite being well established, acute surgery in traumatic acute subdural haematoma is based on low-grade evidence. We aimed to compare the effectiveness of a strategy preferring acute surgical evacuation with one preferring initial conservative treatment in acute subdural haematoma. Methods We did a prospective, observational, comparat...
Article
There is substantial interest in the potential for traumatic brain injury to result in progressive neurological deterioration. While blood biomarkers such as glial fibrillary acidic protein and neurofilament light have been widely explored in characterising acute traumatic brain injury, their use in the chronic phase is limited. Given increasing evid...
Article
Artificial intelligence systems for health care, like any other medical device, have the potential to fail. However, specific qualities of artificial intelligence systems, such as the tendency to learn spurious correlates in training data, poor generalisability to new deployment settings, and a paucity of reliable explainability mechanisms, mean th...
Article
Full-text available
Imperfections in data annotation, known as label noise, are detrimental to the training of machine learning models and have a confounding effect on the assessment of model performance. Nevertheless, employing experts to remove label noise by fully re-annotating large datasets is infeasible in resource-constrained settings, such as healthcare. This...
Article
Background Frailty is known to be associated with poorer outcomes in individuals admitted to hospital for medical conditions requiring intensive care. However, little evidence is available for the effect of frailty on patients’ outcomes after traumatic brain injury. Many frailty indices have been validated for clinical practice and show good perfor...
Article
Deep learning models for semantic segmentation are able to learn powerful representations for pixel-wise predictions, but are sensitive to noise at test time and may lead to implausible topologies. Image registration models on the other hand are able to warp known topologies to target images as a means of segmentation, but typically require large a...
Preprint
Deep learning models have shown great potential for image-based diagnosis assisting clinical decision making. At the same time, an increasing number of reports raise concerns about the potential risk that machine learning could amplify existing health disparities due to human biases that are embedded in the training data. It is of great importance...
Preprint
Domain Adaptation (DA) has recently attracted strong interest in the medical imaging community. While a large variety of DA techniques has been proposed for image segmentation, most of these techniques have been validated either on private datasets or on small publicly available datasets. Moreover, these datasets mostly addressed single-class problem...
Article
Full-text available
Background We aimed to understand the relationship between serum biomarker concentration and lesion type and volume found on computed tomography (CT) following all severities of TBI. Methods Concentrations of six serum biomarkers (GFAP, NFL, NSE, S100B, t-tau and UCH-L1) were measured in samples obtained <24 hours post-injury from 2869 patients wi...
Article
Full-text available
Background Trauma-induced coagulopathy in traumatic brain injury (TBI) remains associated with high rates of complications, unfavorable outcomes, and mortality. The underlying mechanisms are largely unknown. Embedded in the prospective multinational Collaborative European Neurotrauma Effectiveness Research in Traumatic Brain Injury (CENTER-TBI) stu...
Article
Full-text available
We propose a new framework for estimating neuroimaging-derived “brain-age” at a local level within the brain, using deep learning. The local approach, contrary to existing global methods, provides spatial information on anatomical patterns of brain ageing. We trained a U-Net model using brain MRI scans from n = 3,463 healthy people (aged 18–90 year...
Article
Full-text available
Background In traumatic brain injury (TBI), large between-center differences in treatment and outcome for patients managed in the intensive care unit (ICU) have been shown. The aim of this study is to explore if European neurotrauma centers can be clustered, based on their treatment preference in different domains of TBI care in the ICU. Methods P...
Article
Full-text available
Data privacy mechanisms are essential for rapidly scaling medical training databases to capture the heterogeneity of patient data distributions toward robust and generalizable machine learning systems. In the current COVID-19 pandemic, a major focus of artificial intelligence (AI) is interpreting chest CT, which can be readily used in the assessmen...
Article
Full-text available
Background Prehospital care for patients with traumatic brain injury (TBI) varies with some emergency medical systems recommending direct transport of patients with moderate to severe TBI to hospitals with specialist neurotrauma care (SNCs). The aim of this study is to assess variation in levels of early secondary referral within European SNCs and...
Article
Introduction Neurocognitive problems associated with posttraumatic stress disorder (PTSD) can interact with impairment resulting from traumatic brain injury (TBI). Research question We aimed to identify neurocognitive problems associated with probable PTSD following TBI in a civilian sample. Material and methods The study is part of the CENTER-TB...
Preprint
It has been rightfully emphasized that the use of AI for clinical decision making could amplify health disparities. A machine learning model may pick up undesirable correlations, for example, between a patient's racial identity and clinical outcome. Such correlations are often present in (historical) data used for model development. There has been...
Preprint
Full-text available
We develop a new Bayesian model for non-rigid registration of three-dimensional medical images, with a focus on uncertainty quantification. Probabilistic registration of large images with calibrated uncertainty estimates is difficult for both computational and modelling reasons. To address the computational issues, we explore connections between th...
Chapter
Automated disease classification could significantly improve the accuracy of prostate cancer diagnosis on MRI, which is a difficult task even for trained experts. Convolutional neural networks (CNNs) have shown some promising results for disease classification on multi-parametric MRI. However, CNNs struggle to extract robust global features about t...
Preprint
Full-text available
MC Dropout is a mainstream "free lunch" method in medical imaging for approximate Bayesian computations (ABC). Its appeal is to solve out-of-the-box the daunting task of ABC and uncertainty quantification in Neural Networks (NNs); to fall within the variational inference (VI) framework; and to propose a highly multimodal, faithful predictive poster...
Article
Vertebral labelling and segmentation are two fundamental tasks in an automated spine processing pipeline. Reliable and accurate processing of spine images is expected to benefit clinical decision support systems for diagnosis, surgery planning, and population-based analysis of spine and bone health. However, designing automated algorithms for spine...
Article
Background In patients with severe brain injury, withdrawal of life-sustaining measures (WLSM) is common in intensive care units (ICU). WLSM constitutes a dilemma: instituting WLSM too early could result in death despite the possibility of an acceptable functional outcome, whereas delaying WLSM could unnecessarily burden patients, families, clinici...
Preprint
Full-text available
In recent years, the research landscape of machine learning in medical imaging has changed drastically from supervised to semi-, weakly- or unsupervised methods. This is mainly due to the fact that ground-truth labels are time-consuming and expensive to obtain manually. Generating labels from patient metadata might be feasible but it suffers from u...
Chapter
Using publicly available data to determine the performance of methodological contributions is important as it facilitates reproducibility and allows scrutiny of the published results. In lung nodule classification, for example, many works report results on the publicly available LIDC dataset. In theory, this should allow a direct comparison of the...
Chapter
In recent years, the research landscape of machine learning in medical imaging has changed drastically from supervised to semi-, weakly- or unsupervised methods. This is mainly due to the fact that ground-truth labels are time-consuming and expensive to obtain manually. Generating labels from patient metadata might be feasible but it suffers from u...
Chapter
Fetal ultrasound screening during pregnancy plays a vital role in the early detection of fetal malformations which have potential long-term health impacts. The level of skill required to diagnose such malformations from live ultrasound during examination is high and resources for screening are often limited. We present an interpretable, atlas-learn...
Chapter
Whole body magnetic resonance imaging (WB-MRI) is the recommended modality for diagnosis of multiple myeloma (MM). WB-MRI is used to detect sites of disease across the entire skeletal system, but it requires significant expertise and is time-consuming to report due to the great number of images. To aid radiological reading, we propose an auxiliary...
Preprint
Full-text available
Despite impressive accuracy, deep neural networks are often miscalibrated and tend to produce overly confident predictions. Recent techniques like temperature scaling (TS) and label smoothing (LS) show effectiveness in obtaining a well-calibrated model by smoothing logits and hard labels with scalar factors, respectively. However, the use of uniform TS or...
Preprint
Full-text available
Imperfections in data annotation, known as label noise, are detrimental to the training of machine learning models and have an often-overlooked confounding effect on the assessment of model performance. Nevertheless, employing experts to remove label noise by fully re-annotating large datasets is infeasible in resource-constrained settings, such as...
Article
Despite the rapid increase of data available to train machine-learning algorithms in many domains, several applications suffer from a paucity of representative and diverse data. The medical and financial sectors are, for example, constrained by legal, ethical, regulatory and privacy concerns preventing data sharing between institutions. Collaborati...
Chapter
Convolutional Neural Networks (CNNs) are widely used for image classification in a variety of fields, including medical imaging. While most studies deploy cross-entropy as the loss function in such tasks, a growing number of approaches have turned to a family of contrastive learning-based losses. Even though performance metrics such as accuracy, se...
Chapter
Scarcity of high quality annotated images remains a limiting factor for training accurate image segmentation models. While more and more annotated datasets become publicly available, the number of samples in each individual database is often small. Combining different databases to create larger amounts of training data is appealing yet challenging...
Chapter
Semi-supervised learning (SSL) uses unlabeled data during training to learn better models. Previous studies on SSL for medical image segmentation focused mostly on improving model generalization to unseen data. In some applications, however, our primary interest is not generalization but to obtain optimal predictions on a specific unlabeled databas...
Preprint
Full-text available
Using publicly available data to determine the performance of methodological contributions is important as it facilitates reproducibility and allows scrutiny of the published results. In lung nodule classification, for example, many works report results on the publicly available LIDC dataset. In theory, this should allow a direct comparison of the...
Preprint
Full-text available
Convolutional Neural Networks (CNNs) are widely used for image classification in a variety of fields, including medical imaging. While most studies deploy cross-entropy as the loss function in such tasks, a growing number of approaches have turned to a family of contrastive learning-based losses. Even though performance metrics such as accuracy, se...
Article
Full-text available
Background In patients with severe brain injury, withdrawal of life-sustaining measures (WLSM) is common in intensive care units (ICU). WLSM constitutes a dilemma: instituting WLSM too early could result in death despite the possibility of an acceptable functional outcome, whereas delaying WLSM could unnecessarily burden patients, families, clinici...
Article
Full-text available
Background Prehospital care for patients with traumatic brain injury (TBI) varies with some emergency medical systems recommending direct transport of patients with moderate to severe TBI to hospitals with specialist neurotrauma care (SNCs). The aim of this study is to assess variation in levels of early secondary referral within European SNCs and...
Article
Unsupervised abnormality detection is an appealing approach to identify patterns that are not present in training data without specific annotations for such patterns. In the medical imaging field, methods taking this approach have been proposed to detect lesions. The appeal of this approach stems from the fact that it does not require lesion-specif...
Preprint
Full-text available
Datasets are rarely a realistic approximation of the target population. For example, prevalence may be misrepresented or image quality may be above clinical standards. This mismatch is known as sampling bias. Sampling biases are a major hindrance for machine learning models. They cause significant gaps between model performance in the lab and in the real world....
Preprint
Despite technological and medical advances, the detection, interpretation, and treatment of cancer based on imaging data continue to pose significant challenges. These include high inter-observer variability, difficulty of small-sized lesion detection, nodule interpretation and malignancy determination, inter- and intra-tumour heterogeneity, class...
Preprint
Semi-supervised learning (SSL) uses unlabeled data during training to learn better models. Previous studies on SSL for medical image segmentation focused mostly on improving model generalization to unseen data. In some applications, however, our primary interest is not generalization but to obtain optimal predictions on a specific unlabeled databas...
Preprint
Whole body magnetic resonance imaging (WB-MRI) is the recommended modality for diagnosis of multiple myeloma (MM). WB-MRI is used to detect sites of disease across the entire skeletal system, but it requires significant expertise and is time-consuming to report due to the great number of images. To aid radiological reading, we propose an auxiliary...
Preprint
Full-text available
Scarcity of high quality annotated images remains a limiting factor for training accurate image segmentation models. While more and more annotated datasets become publicly available, the number of samples in each individual database is often small. Combining different databases to create larger amounts of training data is appealing yet challenging...
Preprint
Full-text available
Fetal ultrasound screening during pregnancy plays a vital role in the early detection of fetal malformations which have potential long-term health impacts. The level of skill required to diagnose such malformations from live ultrasound during examination is high and resources for screening are often limited. We present an interpretable, atlas-learn...
Preprint
Image classification models deployed in the real world may receive inputs outside the intended data distribution. For critical applications such as clinical decision making, it is important that a model can detect such out-of-distribution (OOD) inputs and express its uncertainty. In this work, we assess the capability of various state-of-the-art ap...
Chapter
The task of image segmentation is inherently noisy due to ambiguities regarding the exact location of boundaries between anatomical structures. We argue that this information can be extracted from the expert annotations at no extra cost, and when integrated into state-of-the-art neural networks, it can lead to improved calibration between soft prob...
Chapter
We propose a parameter efficient Bayesian layer for hierarchical convolutional Gaussian Processes that incorporates Gaussian Processes operating in Wasserstein-2 space to reliably propagate uncertainty. This directly replaces convolving Gaussian Processes with a distance-preserving affine operator on distributions. Our experiments on brain tissue-s...
Article
The aim of this paper is to provide a comprehensive overview of the MICCAI 2020 AutoImplant Challenge. The approaches and publications submitted and accepted within the challenge will be summarized and reported, highlighting common algorithmic trends and algorithmic diversity. Furthermore, the evaluation results will be presented, compared and dis...
Preprint
Full-text available
We propose a parameter efficient Bayesian layer for hierarchical convolutional Gaussian Processes that incorporates Gaussian Processes operating in Wasserstein-2 space to reliably propagate uncertainty. This directly replaces convolving Gaussian Processes with a distance-preserving affine operator on distributions. Our experiments on brain tissue-s...
Preprint
Full-text available
While the importance of automatic image analysis is increasing at an enormous pace, recent meta-research revealed major flaws with respect to algorithm validation. Specifically, performance metrics are key for objective, transparent and comparative performance assessment, but relatively little attention has been given to the practical pitfalls when...
Preprint
The task of image segmentation is inherently noisy due to ambiguities regarding the exact location of boundaries between anatomical structures. We argue that this information can be extracted from the expert annotations at no extra cost, and when integrated into state-of-the-art neural networks, it can lead to improved calibration between soft prob...
Preprint
Class imbalance poses a challenge for developing unbiased, accurate predictive models. In particular, in image segmentation, neural networks may overfit to the foreground samples from small structures, which are often heavily under-represented in the training set, leading to poor generalization. In this study, we provide new insights on the problem...
Article
Background Fatigue is one of the most commonly reported subjective symptoms following traumatic brain injury (TBI). The aims were to assess frequency of fatigue over the first 6 months after TBI, and examine whether fatigue changes could be predicted by demographic characteristics, injury severity and comorbidities. Methods Patients with acute TBI...
Article
We propose a parameter efficient Bayesian layer for hierarchical convolutional Gaussian Processes that incorporates Gaussian Processes operating in Wasserstein-2 space to reliably propagate uncertainty. This directly replaces convolving Gaussian Processes with a distance-preserving affine operator on distributions. Our experiments on brain tissue-s...
Article
Class imbalance poses a challenge for developing unbiased, accurate predictive models. In particular, in image segmentation, neural networks may overfit to the foreground samples from small structures, which are often heavily under-represented in the training set, leading to poor generalization. In this study, we provide new insights on the problem...
Preprint
Deep learning models for semantic segmentation are able to learn powerful representations for pixel-wise predictions, but are sensitive to noise at test time and do not guarantee a plausible topology. Image registration models on the other hand are able to warp known topologies to target images as a means of segmentation, but typically require larg...
Chapter
Cranial implant design is a challenging task, whose accuracy is crucial in the context of cranioplasty procedures. This task is usually performed manually by experts using computer-assisted design software. In this work, we propose and evaluate alternative automatic deep learning models for cranial implant reconstruction from CT images. The models...
Article
Full-text available
Causal reasoning can shed new light on the major challenges in machine learning for medical imaging: scarcity of high-quality annotated data and mismatch between the development dataset and the target environment. A causal perspective on these issues allows decisions about data collection, annotation, preprocessing, and learning strategies to be ma...
Article
Full-text available
Importance Personalized radiotherapy planning depends on high-quality delineation of target tumors and surrounding organs at risk (OARs). This process puts additional time burdens on oncologists and introduces variability among both experts and institutions. Objective To explore clinically acceptable autocontouring solutions that can be integrated...
Preprint
Full-text available
We investigate the usefulness of Wasserstein-2 kernels in the context of hierarchical Gaussian Processes. Stemming from an observation that stacking Gaussian Processes severely diminishes the model's ability to detect outliers, which when combined with non-zero mean functions, further extrapolates low variance to regions with low training data dens...
Article
The role of extra-cranial injury burden in cerebrovascular response in traumatic brain injury (TBI) is poorly documented. This study preliminarily assesses the association between admission features of extra-cranial injury burden and cerebrovascular reactivity. Using the CENTER-TBI HR ICU sub-study cohort, we evaluated those patients with both archi...
Article
Purpose: Deep learning (DL) algorithms have shown promising results for brain tumor segmentation in MRI. However, validation is required prior to routine clinical use. We report the first randomized and blinded comparison of DL and trained technician segmentations. Approach: We compiled