Geert Molenberghs
Universiteit Hasselt and University of Leuven · I-BioStat

About

1,033
Publications
203,110
Reads
32,662
Citations
Additional affiliations
August 2007 - present
KU Leuven
Position
  • Professor, Head of I-BioStat
October 1993 - present
Hasselt University
Position
  • Professor, Head of I-BioStat

Publications (1,033)
Article
In the meta-analytic surrogate evaluation framework, the trial-level coefficient of determination $R^2_{\text{trial}}$ quantifies the strength of the association between the expected causal treatment effects on the surrogate (S) and the true (T) endpoints. Burzykowski and Buyse supplemented this metric of surrogacy with the surrogate threshold effect (STE)...
Article
Full-text available
During most of their life, stars fuse hydrogen into helium in their cores. The mixing of chemical elements in the radiative envelope of stars with a convective core is able to replenish the core with extra fuel. If effective, such deep mixing allows stars to live longer and change their evolutionary path. Yet localized observations to constrain int...
Technical Report
Full-text available
This report describes the combined impact of different social distancing scenarios, the 501Y.V1 variant and the ongoing vaccination campaign in Belgium and illustrates the importance of epidemic control in the period up to August 1, 2021. • Changing social distancing behaviour as a result of lifting measures too soon might, in spite of the ongoing...
Article
Full-text available
Starting from historic reflections, the current SARS-CoV-2 induced COVID-19 pandemic is examined from various perspectives, in terms of what it implies for the implementation of non-pharmaceutical interventions, the modeling and monitoring of the epidemic, the development of early-warning systems, the study of mortality, prevalence estimation, diag...
Article
Although COVID-19 has been spreading throughout Belgium since February, 2020, its spatial dynamics in Belgium remain poorly understood, partly due to the limited testing of suspected cases during the epidemic’s early phase. We analyse data of COVID-19 symptoms, as self-reported in a weekly online survey, which is open to all Belgian citizens. We pr...
Article
Given the heterogeneous responses to therapy and the high cost of treatments, there is an increasing interest in identifying pretreatment predictors of therapeutic effect. Clearly, the success of such an endeavor will depend on the amount of information that the patient‐specific variables convey about the individual causal treatment effect on the r...
Article
The relationship between association and surrogacy has been the focus of much debate in the surrogate marker literature. Recently, the individual causal association (ICA) has been introduced as a metric of surrogacy in the causal inference framework, when both the surrogate and the true endpoint are normally distributed and when both are binary. Ea...
Article
Full-text available
Background: Immunosenescence biomarkers and peripheral blood parameters are evaluated separately as possible predictive markers of immunotherapy. Here, we illustrate the use of a causal inference model to identify predictive biomarkers of CIMAvaxEGF success in the treatment of Non-Small Cell Lung Cancer Patients. Methods: Data from a controlled...
Preprint
Full-text available
Objective. Scrutiny of COVID-19 mortality in Belgium over the period 8 March-9 May 2020 (Weeks 11-19), using number of deaths per million, infection fatality rates, and the relation between COVID-19 mortality and excess death rates. Data. Publicly available COVID-19 mortality (2020); overall mortality (2009-2020) data in Belgium and demographic dat...
Preprint
Full-text available
Although COVID-19 has been spreading throughout Belgium since February, 2020, its spatial dynamics in Belgium remain poorly understood, due to the limited testing of suspected cases. We analyse data of COVID-19 symptoms, as self-reported in a weekly online survey, which is open to all Belgian citizens. We predict symptoms' incidence using binomial...
Article
Full-text available
Since factor analysis is one of the most often used techniques in psychometrics, comparing or combining solutions from different factor analyses is often needed. Several measures to compare factors exist; one of the best known is Tucker's congruence coefficient, which is enjoying newly found popularity thanks to the recent work of Lorenzo-Seva and...
Article
The purpose of this paper is to contrast the Mantel–Haenszel estimator with an optimal estimator to better understand its specific nature, as well as some unique and interesting properties of the data setting for which it was developed. It is emphasized here that the Mantel–Haenszel estimator does not follow from optimality considerations, but neve...
Article
The draft ICH E9(R1) addendum stipulates that an estimator should align with its associated estimand and yield an estimate that facilitates reliable interpretations. The addendum further stipulates that assumptions should be justifiable and plausible, and that the extent of assumptions is an important consideration for whether an estimate will be r...
Article
This paper provides examples of defining estimands in real-world scenarios following ICH E9(R1) guidelines. Detailed discussions on choosing the estimands and estimators can be found in our companion papers. Three scenarios of increasing complexity are illustrated. The first example is a proof-of-concept trial in major depressive disorder where the...
Article
The National Research Council (NRC) Expert Panel Report on Prevention and Treatment of Missing Data in Clinical Trials highlighted the need for clearly defining objectives and estimands. That report sparked considerable discussion and literature on estimands and how to choose them. Importantly, consideration moved beyond missing data to include all...
Article
Full-text available
Introduction Currently, no treatment that delays the progression of Friedreich ataxia is available. In the majority of patients, Friedreich ataxia is caused by homozygous pathological expansion of GAA repeats in the first intron of the FXN gene. Nicotinamide acts as a histone deacetylase inhibitor. Dose escalation studies have shown that short...
Article
Full-text available
Objectives We investigated the potential impact of reduced tobacco use scenarios on total life expectancy and health expectancies, i.e., healthy life years and unhealthy life years. Methods Data from the Belgian Health Interview Survey 2013 were used to estimate smoking and disability prevalence. Disability was based on the Global Activity Limitat...
Preprint
Background: Immunosenescence biomarkers and peripheral blood parameters are evaluated separately as possible predictive markers of immunotherapy. Here we illustrate the use of a causal inference model to identify predictive biomarkers of CIMAvaxEGF success in the treatment of Non–Small Cell Lung Cancer Patients. Methods: Data from a clinical trial...
Preprint
Full-text available
Background: Immunosenescence biomarkers and peripheral blood parameters are evaluated separately as possible predictive markers of immunotherapy. Here, we illustrate the use of a causal inference model to identify predictive biomarkers of CIMAvaxEGF success in the treatment of Non–Small Cell Lung Cancer Patients. Methods: Data from a controlled cli...
Preprint
Full-text available
Background: Immunosenescence biomarkers and peripheral blood parameters are evaluated separately as possible predictive markers of immunotherapy. Here, we illustrate the use of a causal inference model to identify predictive biomarkers of CIMAvaxEGF success in the treatment of Non–Small Cell Lung Cancer Patients. Methods: Data from a clinical trial...
Article
Full-text available
In Belgium, variations in thyroid cancer incidence were observed around the major nuclear sites. The present ecological study investigates whether there is an excess incidence of thyroid cancer among people living in the vicinity of the four nuclear sites at the smallest Belgian geographical level. Rate ratios were obtained from a Bayesian hierarch...
Article
When assessing surrogate endpoints in clinical studies under a causal-inference framework, a simulation-based sensitivity analysis is required, so as to sample the unidentifiable parameters across plausible values. To be precise, correlation matrices need to be sampled with only some of their entries identified from the data, known as the matrix co...
Article
Identification of genomic biomarkers is an important area of research in the context of drug discovery experiments. These experiments typically consist of several high dimensional datasets that contain information about a set of drugs (compounds) under development. This type of data structure introduces the challenge of multi-source data integratio...
Article
Biomarkers play a key role in the monitoring of disease progression. The time taken for an individual to reach a biomarker exceeding or lower than a meaningful threshold is often of interest. Due to the inherent variability of biomarkers, persistence criteria are sometimes included in the definitions of progression, such that only two consecutive m...
Presentation
Context: While factor/principal component analysis is a popular technique to deal with a high number of variables, for example in health surveys, comparing or combining solutions from different factor analyses can be cumbersome even though combining factors is necessary in several situations. For example, when applying multiple imputation (to account...
Article
Clustered count data are commonly analysed by the generalized linear mixed model (GLMM). Here, the correlation due to clustering and some overdispersion is captured by the inclusion of cluster-specific normally distributed random effects. Often, the model does not capture the variability completely. Therefore, the GLMM can be extended by including...
Article
Emulators provide approximations to computationally expensive functions and are widely used in diverse domains, despite the ever increasing speed of computational devices. In this paper we establish a connection between two independently developed emulation methods: radial basis function networks and Gaussian process emulation. The methodological r...
Article
Full-text available
In 1981, the idea of a superwind that ends the life of cool giant stars was proposed. Extreme OH/IR-stars develop superwinds with the highest mass-loss rates known so far, up to a few 10^(-4) Msun/yr, informing our understanding of the maximum mass-loss rate achieved during the Asymptotic Giant Branch (AGB) phase. A conundrum arises whereby the ob...
Article
Full-text available
In the version of this Letter originally published, the caption of Fig. 2 incorrectly said J = 3–2; it should have said J = 2–1. This has now been corrected.
Article
Full-text available
Background Multi-mode data collection is widely used in surveys. Since several modes of data collection are successively applied in such design (e.g. self-administered questionnaire after face-to-face interview), partial nonresponse occurs if participants fail to complete all stages of the data collection. Although such nonresponse might seriously...
Article
This paper provides examples of defining estimands in real-world scenarios following ICH E9(R1) guidelines. Detailed discussions on choosing the estimands and estimators can be found in our companion papers. Three scenarios of increasing complexity are illustrated. The first example is a proof-of-concept trial in major depressive disorder where the...
Article
The draft ICH E9(R1) addendum stipulates that an estimator should align with its associated estimand and yield an estimate that facilitates reliable interpretations. The addendum further stipulates that assumptions should be justifiable and plausible, and that the extent of assumptions is an important consideration for whether an estimate will be r...
Article
The National Research Council (NRC) Expert Panel Report on Prevention and Treatment of Missing Data in Clinical Trials highlighted the need for clearly defining objectives and estimands. That report sparked considerable discussion and literature on estimands and how to choose them. Importantly, consideration moved beyond missing data to include all...
Article
Full-text available
In spite of medical and methodological advances, the identification of good surrogate endpoints has remained a challenging endeavour. This may, at least partially, be attributable to the fact that most researchers have only focused on univariate surrogate endpoints. In the present work, we argue in favour of using multivariate surrogates and intro...
Article
Full-text available
The asteroseismic modelling of period spacing patterns from gravito-inertial modes in stars with a convective core is a high-dimensional problem. We utilize the measured period spacing pattern of prograde dipole gravity modes (acquiring Π0), in combination with the effective temperature (Teff) and surface gravity (log g) derived from spectroscopy, t...
Preprint
The asteroseismic modelling of period spacing patterns from gravito-inertial modes in stars with a convective core is a high-dimensional problem. We utilise the measured period spacing pattern of prograde dipole gravity modes (acquiring $\Pi_0$), in combination with the effective temperature ($T_{\rm eff}$) and surface gravity ($\log g$) derived fr...
Article
Full-text available
At the beginning of the 21st century, a new paradigm was introduced for the evaluation of surrogate endpoints based on meta-analysis. In this paradigm, the putative surrogate is assessed at two different levels, the so-called trial and individual levels. Trial level surrogacy is defined as the association between the expected causal treatment effec...
Article
This paper presents the rationale, genesis, and applications of Project Cornelia, an ongoing computational art history project developed by a cross-disciplinary team at the KU Leuven (University of Leuven). It shares practical perspectives acquired while conceptualizing and unfolding the project and discusses successes as well as challenges and set...
Article
Surrogate endpoints need to be statistically evaluated before they can be used as substitutes of true endpoints in clinical studies. However, even though several evaluation methods have been introduced over the last decades, the identification of good surrogate endpoints remains practically and conceptually challenging. In the present work, the que...
Article
Full-text available
Background Non-suicidal self-injury (NSSI) is defined as the repetitive, direct, and deliberate destruction of one’s body tissue without an intention to die. Existing cross-sectional research indicates that the association between maternal/peer attachment and NSSI is mediated by identity synthesis and confusion. However, longitudinal confirmation o...
Article
Full-text available
The simultaneous presence of variability due to both pulsations and binarity is no rare phenomenon. Unfortunately, the complexities of dealing with even one of these sources of variability individually means that the other signal is often treated as a nuisance and discarded. However, both types of variability offer means to probe fundamental stellar...
Article
The increase in life expectancy followed by the burden of chronic diseases contributes to disability at older ages. The estimation of how much chronic conditions contribute to disability can be useful to develop public health strategies to reduce the burden. This paper introduces the R package addhaz, which is based on the attribution method (Nusse...
Article
Full-text available
The asteroseismic modelling of period spacing patterns from gravito-inertial modes in stars with a convective core is a high-dimensional problem. We utilize the measured period spacing pattern of prograde dipole gravity modes (acquiring Π0), in combination with the effective temperature (Teff) and surface gravity (log g) derived from spectroscop...
Article
Data Monitoring Committees (DMCs) are an integral part of clinical drug development. Their use has evolved along with changing study designs and regulatory expectations, which has associated statistical and ethical implications. Although there is guidance from the different regulatory agencies, there are opportunities to bring more consistency to a...
Article
Full-text available
The individual causal association (ICA) has recently been introduced as a metric of surrogacy in a causal‐inference framework. The ICA is defined on the unit interval and quantifies the association between the individual causal effect on the surrogate (ΔS) and true (ΔT) endpoint. In addition, the ICA offers a general assessment of the surrogate pre...
Preprint
Full-text available
We present a key example from sequential analysis, which illustrates that conditional bias reduction can cause infinite mean absolute error.
Article
We consider multiple imputation as a procedure iterating over a set of imputed datasets. Based on an appropriate stopping rule the number of imputed datasets is determined. Simulations and real-data analyses indicate that the sufficient number of imputed datasets may in some cases be substantially larger than the very small numbers that are usually...
Article
Full-text available
Background: IDeAl (Integrated designs and analysis of small population clinical trials) is an EU funded project developing new statistical design and analysis methodologies for clinical trials in small population groups. Here we provide an overview of IDeAl findings and give recommendations to applied researchers. Method: The description of the...
Article
Missing data is almost inevitable in correlated-data studies. For non-Gaussian outcomes with moderate to large sequences, direct-likelihood methods can involve complex, hard-to-manipulate likelihoods. Popular alternative approaches, like generalized estimating equations, that are frequently used to circumvent the computational complexity of full li...
Article
Estimating complex linear mixed models using an iterative full maximum likelihood estimator can be cumbersome in some cases. With small and unbalanced datasets, convergence problems are common. Also, for large datasets, iterative procedures can be computationally prohibitive. To overcome these computational issues, an unbiased two-stage closed-form...
Preprint
Full-text available
The simultaneous presence of variability due to both pulsations and binarity is no rare phenomenon. Unfortunately, the complexities of dealing with even one of these sources of variability individually means that the other signal is often treated as a nuisance and discarded. However, both types of variability offer means to probe fundamental stella...
Article
Missing data methods, maximum likelihood estimation (MLE) and multiple imputation (MI), for longitudinal questionnaire data were investigated via simulation. Predictive mean matching (PMM) was applied at both item and scale levels, logistic regression at item level and multivariate normal imputation at scale level. We investigated a hybrid approach...
Article
Full-text available
We refine the classical Lindeberg-Feller central limit theorem by obtaining asymptotic bounds on the Kolmogorov distance, the Wasserstein distance, and the parametrized Prokhorov distances in terms of a Lindeberg index. We thus obtain more general approximate central limit theorems, which roughly state that the row-wise sums of a triangular array a...
Article
Purpose Vascular factors have been suggested to influence the development and progression of glaucoma. They are thought to be especially relevant for normal‐tension glaucoma (NTG) patients. We aim to investigate which vascular factors, including advanced vascular examinations, better describe patients with NTG compared to those with primary open‐a...
Article
A Weibull-model-based approach is examined to handle under- and overdispersed discrete data in a hierarchical framework. This methodology was first introduced by Nakagawa and Osaki (1975, IEEE Transactions on Reliability, 24, 300–301), and later examined for under- and overdispersion by Klakattawi et al. (2018, Entropy, 20, 142) in the univariate c...
Article
The maximum entropy principle offers a constructive criterion for setting up probability distributions on the basis of partial knowledge. In the present work, the principle is applied to tackle an important problem in the surrogate marker field, namely, the evaluation of a binary outcome as a putative surrogate for a binary true endpoint within a c...
Article
Full-text available
We propose a methodological framework to perform forward asteroseismic modeling of stars with a convective core, based on gravity-mode oscillations. These probe the near-core region in the deep stellar interior. The modeling relies on a set of observed high-precision oscillation frequencies of low-degree coherent gravity modes with long lifetimes a...
Article
Full-text available
The emergence of multidrug-resistant tuberculosis (MDR-TB), defined as Mycobacterium tuberculosis strains with in vitro resistance to at least isoniazid and rifampicin, has necessitated evaluation and validation of appropriate surrogate endpoints for treatment response in drug trials for MDR-TB. The trial that has demonstrated efficacy of bedaquili...
Data
Institutional review boards. (DOCX)
Data
A) Relationship between S24 (on the basis of AFB smear conversion) and T for BDQ; B) Relationship between S24 (on the basis of AFB smear conversion) and T for Placebo control. (DOCX)
Data
A) Relationship between S24 (on the basis of culture conversion) and T for BDQ: imputed values; B) Relationship between S24 (on the basis of culture conversion) and T for Placebo control: imputed values. (DOCX)
Article
Gaussian process (GP) emulation is a relatively recent statistical technique that provides a fast-running approximation to a complex computer model, given training data generated by the considered model. Despite its sound theoretical foundation, GP emulation falls short in practical applications where the training dataset is very large, due to nume...