April 2025
·
75 Reads
The Lancet Respiratory Medicine
March 2025
·
41 Reads
Statistics in Medicine
Targeted maximum likelihood estimation (TMLE) is an increasingly popular framework for the estimation of causal effects. It requires modeling both the exposure and outcome but is doubly robust in the sense that it is valid if at least one of these models is correctly specified. In addition, TMLE allows for flexible modeling of both the exposure and outcome with machine learning methods. This provides better control for measured confounders since the model specification automatically adapts to the data, instead of needing to be specified by the analyst a priori. Despite these methodological advantages, TMLE remains less popular than alternatives in part because of its less accessible theory and implementation. While some tutorials have been proposed, none address the case of a time‐to‐event outcome. This tutorial provides a detailed step‐by‐step explanation of the implementation of TMLE for estimating the effect of a point binary or multilevel exposure on a time‐to‐event outcome, modeled as counterfactual survival curves and causal hazard ratios. The tutorial also provides guidelines on how best to use TMLE in practice, including aspects related to study design, choice of covariates, controlling biases and use of machine learning. R‐code is provided to illustrate each step using simulated data (https://github.com/detal9/SurvTMLE). To facilitate implementation, a general R function implementing TMLE with options to use machine learning is also provided. The method is illustrated in a real‐data analysis concerning the effectiveness of statins for the prevention of a first cardiovascular disease among older adults in Québec, Canada, between 2013 and 2018.
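The double-robustness property described in this abstract can be illustrated with a small simulation. The sketch below is not TMLE and is not taken from the tutorial's R code; it uses the simpler augmented inverse-probability-weighted (AIPW) estimator for a continuous outcome, which shares the same property. The data-generating model, the function name `aipw_ate`, and the deliberately misspecified models are all assumptions made for illustration.

```python
import numpy as np


def aipw_ate(y, a, ps, mu1, mu0):
    """AIPW estimate of E[Y(1)] - E[Y(0)].

    ps  : propensity scores P(A=1 | X) from the exposure model
    mu1 : outcome-model predictions E[Y | A=1, X]
    mu0 : outcome-model predictions E[Y | A=0, X]
    """
    y1 = mu1 + a * (y - mu1) / ps
    y0 = mu0 + (1 - a) * (y - mu0) / (1 - ps)
    return float(np.mean(y1 - y0))


# Hypothetical simulated data: one confounder x, true average effect = 2.
rng = np.random.default_rng(0)
n = 200_000
x = rng.normal(size=n)
ps_true = 1 / (1 + np.exp(-0.5 * x))      # true exposure model
a = rng.binomial(1, ps_true)
y = 2.0 * a + x + rng.normal(size=n)      # true outcome model

# Exposure model correct, outcome model deliberately wrong (all zeros).
est_ps_ok = aipw_ate(y, a, ps_true, np.zeros(n), np.zeros(n))

# Outcome model correct, exposure model deliberately wrong (constant 0.5).
est_out_ok = aipw_ate(y, a, np.full(n, 0.5), 2.0 + x, x)

# Unadjusted difference in means, confounded by x.
naive = y[a == 1].mean() - y[a == 0].mean()

print(est_ps_ok, est_out_ok, naive)
```

In this setup both AIPW estimates should land close to the true value of 2 even though one nuisance model is wrong in each case, while the unadjusted difference in means is biased upward by the confounder.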
February 2025
·
28 Reads
Statistics in Medicine
The test‐negative design (TND), which is routinely used for monitoring seasonal flu vaccine effectiveness (VE), has recently become integral to COVID‐19 vaccine surveillance, notably in Québec, Canada. Some studies have addressed the identifiability and estimation of causal parameters under the TND, but efficiency bounds for nonparametric estimators of the target parameter under the unconfoundedness assumption have not yet been investigated. Motivated by the goal of improving adjustment for measured confounders when estimating COVID‐19 VE among community‐dwelling people aged years in Québec, we propose a one‐step doubly robust and locally efficient estimator called TNDDR (TND doubly robust), which utilizes cross‐fitting (sample splitting) and can incorporate machine learning techniques to estimate the nuisance functions and thus improve control for measured confounders. We derive the efficient influence function (EIF) for the marginal expectation of the outcome under a vaccination intervention, explore the von Mises expansion, and establish the conditions for √n‐consistency, asymptotic normality, and double robustness of TNDDR. The proposed estimator is supported by both theoretical and empirical justifications.
February 2025
·
4 Reads
The Test-Negative Design (TND), which involves recruiting care-seeking individuals who meet predefined clinical case criteria, offers valid statistical inference for Vaccine Effectiveness (VE) using data collected through passive surveillance, making it cost-efficient and timely. Infectious disease epidemiology often involves interference, where the treatment and/or outcome of one individual can affect the outcomes of others, rendering standard causal estimands ill-defined; ignoring such interference can bias VE evaluation and lead to ineffective vaccination policies. This article addresses the estimation of causal estimands for VE in the presence of partial interference using TND samples. Partial interference means that the vaccination of units within the same group/cluster may influence the outcomes of other members of the cluster. We define the population direct, spillover, total, and overall effects using the geometric risk ratio, which are identifiable under TND sampling. We investigate various stochastic policies for vaccine allocation in a counterfactual scenario, and identify policy-relevant VE causal estimands. We propose inverse-probability weighted (IPW) estimators for estimating the policy-relevant VE causal estimands with partial interference under the TND, and explore the statistical properties of these estimators.
January 2025
·
5 Reads
Understanding treatment effect heterogeneity is important for decision making in medical and clinical practices, or handling various engineering and marketing challenges. When dealing with high-dimensional covariates or when the effect modifiers are not predefined and need to be discovered, data-adaptive selection approaches become essential. However, with data-driven model selection, the quantification of statistical uncertainty is complicated by post-selection inference due to difficulties in approximating the sampling distribution of the target estimator. Data-driven model selection tends to favor models with strong effect modifiers with an associated cost of inflated type I errors. Although several frameworks and methods for valid statistical inference have been proposed for ordinary least squares regression following data-driven model selection, fewer options exist for valid inference for effect modifier discovery in causal modeling contexts. In this article, we extend two different methods to develop valid inference for penalized G-estimation that investigates effect modification of proximal treatment effects within the structural nested mean model framework. We show the asymptotic validity of the proposed methods. Using extensive simulation studies, we evaluate and compare the finite sample performance of the proposed methods and the naive inference based on a sandwich variance estimator. Our work is motivated by the study of hemodiafiltration for treating patients with end-stage renal disease at the Centre Hospitalier de l'Université de Montréal. We apply these methods to draw inference about the effect heterogeneity of dialysis facility on the repeated session-specific hemodiafiltration outcomes.
January 2025
·
2 Reads
·
2 Citations
Biometrics
Effect modification occurs when the impact of the treatment on an outcome varies based on the levels of other covariates known as effect modifiers. Modeling these effect differences is important for etiological goals and for purposes of optimizing treatment. Structural nested mean models (SNMMs) are useful causal models for estimating the potentially heterogeneous effect of a time-varying exposure on the mean of an outcome in the presence of time-varying confounding. A data-adaptive selection approach is necessary if the effect modifiers are unknown a priori and need to be identified. Although variable selection techniques are available for estimating the conditional average treatment effects using marginal structural models or for developing optimal dynamic treatment regimens, all of these methods consider a single end-of-follow-up outcome. In the context of an SNMM for repeated outcomes, we propose a doubly robust penalized G-estimator for the causal effect of a time-varying exposure with a simultaneous selection of effect modifiers and prove the oracle property of our estimator. We conduct a simulation study for the evaluation of its performance in finite samples and verification of its double-robustness property. Our work is motivated by the study of hemodiafiltration for treating patients with end-stage renal disease at the Centre Hospitalier de l’Université de Montréal. We apply the proposed method to investigate the effect heterogeneity of dialysis facility on the repeated session-specific hemodiafiltration outcomes.
December 2024
·
20 Reads
Objective: This study sought to compare the drop in predictive performance over time according to the modeling approach (regression versus machine learning) used to build a kidney transplant failure prediction model with a time-to-event outcome. Study Design and Setting: The Kidney Transplant Failure Score (KTFS) was used as a benchmark. We reused the data from which it was developed (DIVAT cohort, n=2,169) to build another prediction algorithm using a survival super learner combining (semi-)parametric and non-parametric methods. Performance in DIVAT was estimated for the two prediction models using internal validation. Then, the drop in predictive performance was evaluated in the same geographical population approximately ten years later (EKiTE cohort, n=2,329). Results: In DIVAT, the super learner achieved better discrimination than the KTFS, with a tAUROC of 0.83 (0.79-0.87) compared to 0.76 (0.70-0.82). While the discrimination remained stable for the KTFS, it was not the case for the super learner, with a drop to 0.80 (0.76-0.83). Regarding calibration, the survival SL overestimated graft survival at development, while the KTFS underestimated graft survival ten years later. Brier score values were similar regardless of the approach and the timing. Conclusion: The more flexible SL provided superior discrimination on the population used to fit it compared to a Cox model and similar discrimination when applied to a future dataset of the same population. Both methods are subject to calibration drift over time. However, weak calibration on the population used to develop the prediction model was correct only for the Cox model, and recalibration should be considered in the future to correct the calibration drift.
December 2024
·
4 Reads
Sequential positivity is often a necessary assumption for drawing causal inferences, such as through marginal structural modeling. Unfortunately, verification of this assumption can be challenging because it usually relies on multiple parametric propensity score models, which are unlikely to all be correctly specified. Therefore, we propose a new algorithm, called "sequential Positivity Regression Tree" (sPoRT), to check this assumption with greater ease under either static or dynamic treatment strategies. This algorithm also identifies the subgroups found to be violating this assumption, allowing for insights about the nature of the violations and potential solutions. We first present different versions of sPoRT based on either stratifying or pooling over time. Finally, we illustrate its use in a real-life application of HIV-positive children in Southern Africa with and without pooling over time. An R notebook showing how to use sPoRT is available at github.com/ArthurChatton/sPoRT-notebook.
December 2024
·
27 Reads
The test-negative design (TND), which is routinely used for monitoring seasonal flu vaccine effectiveness (VE), has recently become integral to COVID-19 vaccine surveillance, notably in Quebec, Canada. Some studies have addressed the identifiability and estimation of causal parameters under the TND, but efficiency bounds for nonparametric estimators of the target parameter under the unconfoundedness assumption have not yet been investigated. Motivated by the goal of improving adjustment for measured confounders when estimating COVID-19 VE among community-dwelling people aged years in Quebec, we propose a one-step doubly robust and locally efficient estimator called TNDDR (TND doubly robust), which utilizes cross-fitting (sample splitting) and can incorporate machine learning techniques to estimate the nuisance functions and thus improve control for measured confounders. We derive the efficient influence function (EIF) for the marginal expectation of the outcome under a vaccination intervention, explore the von Mises expansion, and establish the conditions for consistency, asymptotic normality and double robustness of TNDDR. The proposed estimator is supported by both theoretical and empirical justifications.
November 2024
·
4 Reads
·
1 Citation
Journal of the Royal Statistical Society Series C Applied Statistics
Obtaining continuously updated predictions is a major challenge for personalized medicine. Leveraging combinations of parametric regressions and machine learning algorithms, the personalized online super learner (POSL) can achieve such dynamic and personalized predictions. We adapt POSL to predict a repeated continuous outcome dynamically and propose a new way to validate such personalized or dynamic prediction models. We illustrate its performance by predicting the convection volume of patients undergoing hemodiafiltration. POSL outperformed its candidate learners with respect to median absolute error, calibration-in-the-large, discrimination, and net benefit. We finally discuss the choices and challenges underlying the use of POSL.
... This applies to cases where the users have strong a priori knowledge of which variables are the effect modifiers and are interested in certain of them. When the effect modifiers are unknown, one can use data-driven methods to explore the effect modifier and proceed with the analysis [41][42][43] ; or one can estimate the conditional average treatment effects (conditional on all possible covariates), set subgroups ( ) by individuals who have similar conditional average treatment effects, and proceed with the analysis. 3,44 Furthermore, condition A4 is untestable and may be too strong in some scenarios. ...
January 2025
Biometrics
... It has been shown that the resulting prediction is asymptotically at least as accurate as the best candidate learners, 9 and was successfully applied in other domains. 10, 11 However, it has seldom been used for time-to-event outcomes, 12,13 although it can handle such outcomes with right-censoring. 14,15 Unfortunately, CPM performance comparisons focus almost solely on validity at the time of development. ...
November 2024
Journal of the Royal Statistical Society Series C Applied Statistics
... While TND is convenient and has recently become integral to COVID-19 vaccine surveillance, it is prone to several potential biases, including confounding and collider bias [6,12,13]. The identification and estimation of statistical and causal estimands related to VE under the TND have been investigated in past work. ...
August 2024
American Journal of Epidemiology
... This limits data usage, excluding individuals who experienced the outcome during the interval and overlooking potential changes in trajectories during the follow-up period. To address these limitations and make better use of the data, we introduced a combination of history-restricted MSMs (HRMSMs) and LCGA [5]. LCGA-HRMSM can be seen as a repeated application of LCGA-MSM across multiple time intervals, allowing for the consideration of time-dependent outcomes. ...
August 2024
The International Journal of Biostatistics
... Recently, considerable research has been conducted on time-dependent covariate models that overcome the limitations of existing time-fixed covariate-based models and effectively reflect the dynamic characteristics of data [13][14][15] . Unlike models that reflect only a specific point in time, time-dependent covariate models integrate longitudinal data and include it in the analysis, allowing the dynamic characteristics of the data to be reflected in the analysis. ...
May 2024
Statistics in Medicine
... Similarly, a later publication established that it is not possible to draw conclusions about the effectiveness of statins in the primary prevention of CVD in older adults. It also highlights the need for a detailed assessment of the sources of bias and a careful interpretation of the results in observational studies (49). In this regard, the United States Preventive Services Task Force concludes that current evidence is insufficient to assess the balance between the benefits and risks associated with statin treatment in the primary prevention of ASCVD events and mortality in adults aged 76 years or older. ...
February 2024
Journal of Clinical Epidemiology
... Selection bias happens in observational studies focusing on the population seeking a PCR test, for example when both vaccination and infection encourage test-seeking. Conditioning on the collider (test-seeking behaviour) creates a negative correlation between vaccination and test positivity which would bias effectiveness upwards [18][19][20]. In our data, the vaccine coverage at week 52 was 52% in ≥ 65-year-old individuals, comparable to the nationwide figure of 50%, suggesting that vaccination did not influence test-seeking [21]. ...
January 2023
Epidemiology
... The identification and estimation of statistical and causal estimands related to VE under the TND have been investigated in past work. Up until recently, logistic regression was the only statistical method proposed to adjust for confounders to estimate VE under this design [14][15][16]. Under strong modeling assumptions and conditional independence between vaccination and the condition of having some other infection leading to inclusion criteria, the logistic regression produces estimates of a conditional risk ratio of the association between vaccination and the probability of medically-attended symptomatic infection [1,17]. ...
December 2023
Vaccine
... We also included: self-esteem, impulsivity, and novelty-seeking (a genetic tendency to feel intense excitement and actively explore new or potentially rewarding experiences, while also avoiding monotony and possible punishment Cloninger [1987]). While these three variables were measured in the 12-th cycle, because they are considered personality traits and unlikely to vary considerably over time, they were included as baseline covariates Liu et al. [July, 2023]. Self-esteem was measured using Rosenberg's Self-Esteem Scale Rosenberg [1965]; higher values indicate higher self-esteem Racicot et al. [2013]. ...
December 2023
American Journal of Epidemiology
... Other reported assumptions were method specific, like conditional linearity of potential outcomes and conditional log-odds of study participation (and treatment for an observational source population) [9,44]. Further reported assumptions included no mediator outcome confounding in mediation analysis [57] and parametric assumptions about missingness [65,71], the causal structure [66], and data missing at random [60]. ...
November 2023
Statistics in Medicine