# Vanessa Didelez's research while affiliated with Leibniz Institute for Prevention Research and Epidemiology – BIPS and other places

**What is this page?**

This page lists the scientific contributions of an author, who either does not have a ResearchGate profile, or has not yet added these contributions to their profile.

It was automatically created by ResearchGate to create a record of this author's body of work. We create such pages to advance our goal of creating and maintaining the most comprehensive scientific repository possible. In doing so, we process publicly available (personal) data relating to the author as a member of the scientific community.

## Publications (91)

Background:
An important epidemiological question is understanding how vascular risk factors contribute to cognitive impairment. Using data from the Cardiovascular Health Cognition Study, we investigated how subclinical cardiovascular disease (sCVD) relates to cognitive impairment risk and the extent to which the hypothesized risk is mediated by t...

Variable selection in linear regression settings is a much discussed problem. Best subset selection (BSS) is often considered the intuitive 'gold standard', with its use being restricted only by its NP-hard nature. Alternatives such as the least absolute shrinkage and selection operator (Lasso) or the elastic net (Enet) have become methods of choic...

Background: Epidemiological studies often have missing data. Multiple imputation (MI) is a commonly-used strategy for such studies. MI guidelines for structuring the imputation model have focused on compatibility with the analysis model, but not on the need for the (compatible) imputation model(s) to be correctly specified. Standard (default) MI pr...

Background:
The efficacy of mammography screening in reducing breast cancer mortality has been demonstrated in randomized trials. However, treatment options - and hence prognosis - for advanced tumor stages as well as mammography techniques have considerably improved since completion of these trials. Consequently, the effectiveness of mammography...

Abstract
Background
Real-world data (RWD), e.g. health insurance claims data, offer rich information on health-relevant factors and can form the basis for studies of drug safety, the effectiveness of medical interventions, and much more. A particular advantage, depending on the data source, is the greater generalizab...

Background
Mendelian randomization (MR) is a powerful tool through which the causal effects of modifiable exposures on outcomes can be estimated from observational data. Most exposures vary throughout the life course, but MR is commonly applied to one measurement of an exposure (e.g. weight measured once between ages 40 and 60 years). It has been a...

Causal discovery algorithms estimate causal graphs from observational data. This can provide a valuable complement to analyses focusing on the causal relation between individual treatment‐outcome pairs. Constraint‐based causal discovery algorithms rely on conditional independence testing when building the graph. Until recently, these algorithms hav...

Objective
We aimed to evaluate the effectiveness of screening colonoscopy in reducing incidence of distal vs. proximal colorectal cancer (CRC) in persons aged 55–69.
Study Design and Setting
Using observational data from a German claims database (GePaRD), we emulated a target trial with two arms: Colonoscopy screening vs. no screening at baseline....

A Cohort Causal Graph (CCG) over the life-course from childhood to adolescence is estimated to identify potential causes of obesity and to determine promising targets for prevention strategies. We adapt a popular causal discovery algorithm to deal with missing values by multiple imputation and with temporal cohort structure. To estimate possible ca...

We consider continuous-time survival or more general event-history settings, where the aim is to infer the causal effect of a time-dependent treatment process. This is formalised as the effect on the outcome event of a (possibly hypothetical) intervention on the intensity of the treatment process, i.e. a stochastic intervention. To establish whethe...

Abstract
Studies based on secondary data, such as health insurance claims data, often pose methodological challenges that arise mainly from time dependence but also from unmeasured confounding. In this paper we present strategies to avoid various sources of bias and to ... the ... caused by unmeasured c...

In competing event settings, a counterfactual contrast of cause-specific cumulative incidences quantifies the total causal effect of a treatment on the event of interest. However, effects of treatment on the competing event may indirectly contribute to this total effect, complicating its interpretation. We previously proposed the separable effects...

Purpose:
Investigating intended or unintended effects of sustained drug use is of high clinical relevance but remains methodologically challenging. This feasibility study aims to evaluate the usefulness of the parametric g-formula within a target trial for application to an extensive healthcare database in order to address various sources of time-...

Many statistical problems in causal inference involve a probability distribution other than the one from which data are actually observed; as an additional complication, the object of interest is often a marginal quantity of this other probability distribution. This creates many practical complications for statistical inference, even where the prob...

Huang proposes a method for assessing the impact of a point treatment on mortality either directly or mediated by occurrence of a nonterminal health event, based on data from a prospective cohort study in which the occurrence of the nonterminal health event may be preempted by death but not vice versa. The author uses a causal mediation framework...

Causal discovery algorithms estimate causal graphs from observational data. This can provide a valuable complement to analyses focussing on the causal relation between individual treatment-outcome pairs. Constraint-based causal discovery algorithms rely on conditional independence testing when building the graph. Until recently, these algorithms ha...

In this guide, we present how to perform constraint-based causal discovery using three popular software packages: pcalg (with add-ons tpc and micd), bnlearn, and TETRAD. We focus on how these packages can be used with observational data and in the presence of mixed data (i.e., data where some variables are continuous, while others are categorical),...

Background: Variable selection in linear regression settings is a much discussed problem. Best subset selection (BSS) is often considered as an intuitively appealing ‘gold standard’, with its use being restricted mainly by its NP-hard nature. Instead, alternatives such as the least absolute shrinkage and selection operator (Lasso) or the elastic n...

Several of the hypothesized or studied exposures that may affect dementia risk are known to increase the risk of death. This may explain counterintuitive results, where exposures that are known to be harmful for mortality risk sometimes seem protective for the risk of dementia. Authors have attempted to explain these counterintuitive results as bia...

Causal mediation analysis is a useful tool for epidemiologic research, but it has been criticized for relying on a "cross-world" independence assumption that counterfactual outcome and mediator values are independent even in causal worlds where the exposure assignments for the outcome and mediator differ. This assumption is empirically difficult to...

Subclinical cardiovascular disease (sCVD) is associated with an increased risk of incident Mild Cognitive Impairment (MCI) or Alzheimer’s disease (AD). However, relatively little is known about the mechanisms by which sCVD leads to MCI or AD. One hypothesis is that cardiovascular disease (CVD) mediates the relationship between sCVD and MCI or AD. U...

We discuss causal mediation analyses for survival data and propose a new approach based on the additive hazards model. The emphasis is on a dynamic point of view, that is, understanding how the direct and indirect effects develop over time. Hence, importantly, we allow for a time varying mediator. To define direct and indirect effects in such a lon...

In time-to-event settings, the presence of competing events complicates the definition of causal effects. Here we propose the new separable effects to study the causal effect of a treatment on an event of interest. The separable direct effect is the treatment effect on the event of interest not mediated by its effect on the competing event. The sep...

In competing event settings, a counterfactual contrast of cause-specific cumulative incidences quantifies the total causal effect of a treatment on the event of interest. However, effects of treatment on the competing event may indirectly contribute to this total effect, complicating its interpretation. We previously proposed the separable effects...

Causal discovery algorithms aim to identify causal relations from observational data and have become a popular tool for analysing genetic regulatory systems. In this work, we applied causal discovery to obtain novel insights into the genetic regulation underlying head‐and‐neck squamous cell carcinoma. Some methodological challenges needed to be res...

Causal mediation analysis is a useful tool for epidemiological research, but it has been criticized for relying on a "cross-world" independence assumption that is empirically difficult to verify and problematic to justify based on background knowledge. In the present article we aim to assist the applied researcher in understanding this assumption....

We consider estimation of a total causal effect from observational data via covariate adjustment. Ideally, adjustment sets are selected based on a given causal graph, reflecting knowledge of the underlying causal structure. Valid adjustment sets are, however, not unique. Recent research has introduced a graphical criterion for an 'optimal' valid ad...

In the current era, with increasing availability of results from genetic association studies, finding genetic instruments for inferring causality in observational epidemiology has become apparently simple. Mendelian randomisation (MR) analyses are hence growing in popularity and, in particular, methods that can incorporate multiple instruments are...

In the context of causal mediation analysis, prevailing notions of direct and indirect effects are based on nested counterfactuals. These can be problematic regarding interpretation and identifiability especially when the mediator is a time-dependent process and the outcome is survival or, more generally, a time-to-event outcome. We propose and dis...

We discuss causal mediation analyses for survival data and propose a new approach based on the additive hazards model. The emphasis is on a dynamic point of view, that is, understanding how the direct and indirect effects develop over time. Hence, importantly, we allow for a time varying mediator. To define direct and indirect effects in such a lon...

In time-to-event settings, the presence of competing events complicates the definition of causal effects. Here we propose the new separable effects to study the causal effect of a treatment on an event of interest. The separable direct effect is the treatment effect on the event of interest not mediated by its effect on the competing event. The sep...

In Mendelian randomization (MR), inference about causal relationship between a phenotype of interest and a response or disease outcome can be obtained by constructing instrumental variables from genetic variants. However, MR inference requires three assumptions, one of which is that the genetic variants only influence the outcome through phenotype...

When causal effects are to be estimated from observational data, we have to adjust for confounding. A central aim of covariate selection for causal inference is therefore to determine a set that is sufficient for confounding adjustment, but other aims such as efficiency or robustness can be important as well. In this paper, we review six general ap...

‘Sensitivity of treatment recommendations to bias in network meta‐analysis: A, Appendices’.

Network meta-analysis (NMA) pools evidence on multiple treatments to estimate relative treatment effects. Included studies are typically assessed for risk of bias; however, this provides no indication of the impact of potential bias on a decision based on the NMA. We propose methods to derive bias adjustment thresholds which measure the smallest ch...

In Mendelian randomization (MR), genetic variants are used to construct instrumental variables, which enable inference about the causal relationship between a phenotype of interest and a response or disease outcome. However, standard MR inference requires several assumptions, including the assumption that the genetic variants only influence the res...

Likelihood factors that can be disregarded for inference are termed ignorable. We demonstrate that close ties exist between ignorability and identification of causal effects by covariate adjustment. A graphical condition, stability, plays a role analogous to that of missingness at random, but is applicable to general longitudinal data. Our formulat...

In one procedure for finding the maximal prime decomposition of a Bayesian network or undirected graphical model, the first step is to create a minimal triangulation of the network, and a common and straightforward way to do this is to create a triangulation that is not necessarily minimal and then thin this triangulation by removing excess edges....

This short paper proves inequalities that restrict the magnitudes of the partial correlations in star-shaped structures in Gaussian graphical models. These inequalities have to be satisfied by distributions that are used for generating simulated data to test structure-learning algorithms, but methods that have been used to create such distributions...

Two-stage least squares (TSLS) estimators and variants thereof are widely used to infer the effect of an exposure on an outcome using instrumental variables (IVs). They belong to a wider class of two-stage IV estimators, which are based on fitting a conditional mean model for the exposure, and then using the fitted exposure values along with the co...

A parameter in a statistical model is identified if its value can be uniquely determined from the distribution of the observable data. We consider the context of an instrumental variable analysis with a binary outcome for estimating a causal risk ratio. The semiparametric generalized method of moments and structural mean model frameworks use estima...

There has been much recent interest in systems biology for investigating the structure of gene regulatory systems. Such networks are often formed of specific patterns, or network motifs, that are interesting from a biological point of view. Our aim in the present paper is to compare statistical methods specifically with regard to the quest...

Mendelian randomization studies estimate causal effects using genetic variants as instruments. Instrumental variable methods are straightforward for linear models, but epidemiologists often use odds ratios to quantify effects. Also, odds ratios are often the quantities reported in meta-analyses. Many applications of Mendelian randomization dichotom...

We discuss why it is not always obvious how to simulate longitudinal data from a general marginal structural model (MSM) for a survival outcome while ensuring that the data exhibit complications due to time‐dependent confounding. On the basis of the relation between a directed acyclic graph and an MSM, we suggest a data‐generating process that sati...

Mendelian randomisation is a form of instrumental variable analysis that estimates the causal effect of an intermediate phenotype or exposure on an outcome or disease in the presence of unobserved confounding, using a genetic variant as the instrument. A Bayesian approach allows current knowledge to be incorporated into the analysis in the form of...

In this paper we review the notion of direct causal effect as introduced by Pearl (2001). We show how it can be formulated without counterfactuals, using intervention indicators instead. This allows us to consider the natural direct effect as a special case of sequential treatments discussed by Dawid and Didelez (2005), which immediately yields conditi...

Directed possibly cyclic graphs have been proposed by Didelez (2000) and Nodelman et al. (2002) in order to represent the dynamic dependencies among stochastic processes. These dependencies are based on a generalization of Granger-causality to continuous time, first developed by Schweder (1970) for Markov processes, who called them local dependenc...

We propose a definition of causality for time series in terms of the effect of an intervention in one component of a multivariate time series on another component at some later point in time. Conditions for identifiability, comparable to the back-door and front-door criteria, are presented and can also be verified graphically. Computation of the ca...

We consider conditions that allow us to find an optimal strategy for sequential decisions from a given data situation. For the case where all interventions are unconditional (atomic), identifiability has been discussed by Pearl & Robins (1995). We argue here that an optimal strategy must be conditional, i.e. take the information available at each d...

In this chapter we examine the problem of time-varying confounding, and one method (structural nested accelerated failure time models, estimated using and also known as g-estimation) which may be used to overcome it. A practical example is given, and the methodology demonstrated. Cautions as to the use of g-estimation are provided, and alternative...

We extend Pearl's criticisms of principal stratification analysis as a method for interpreting and adjusting for intermediate variables in a causal analysis. We argue that this can be meaningful only in those rare cases that involve strong functional dependence, and even then may not be appropriate.

Instrumental variables can be used to make inferences about causal effects in the presence of unmeasured confounding. For a model in which the instrument, intermediate/treatment, and outcome variables are all binary, Balke and Pearl (Journal of the American Statistical Association, 1997, 92: 1172–1176) derived nonparametric bounds for the intervent...

In this paper, the authors describe different instrumental variable (IV) estimators of causal risk ratios and odds ratios with particular attention to methods that can handle continuously measured exposures. The authors present this discussion in the context of a Mendelian randomization analysis of the effect of body mass index (BMI; weight (kg)/he...

Investigations into the aetiology of common complex diseases based on observational data should make use of any opportunity to reduce bias due to unobserved confounding. In this context, it has become popular to exploit instrumental variable (IV) methods via Mendelian randomization but the key to success lies in finding suitable genetic instruments...

We consider situations where data have been collected such that the sampling depends on the outcome of interest and possibly further covariates, as for instance in case-control studies. Graphical models represent assumptions about the conditional independencies among the variables. By including a node for the sampling indicator, assumptions about s...

Detection and assessment of the effect of a modifiable risk factor on a disease with view to informing public health intervention policies are of fundamental concern in aetiological epidemiology. In order to have solid evidence that such a public health intervention has the desired effect, it is necessary to ascertain that an observed association o...

When estimating the effect of treatment on HIV using data from observational studies, standard methods may produce biased estimates due to the presence of time-dependent confounders. Such confounding can be present when a covariate, affected by past exposure, is both a predictor of the future exposure and the outcome. One example is the CD4 cell co...

Instrumental variable (IV) methods are becoming increasingly popular as they seem to offer the only viable way to overcome the problem of unobserved confounding in observational studies. However, some attention has to be paid to the details, as not all such methods target the same causal parameters and some rely on more restrictive parametric assum...

We consider the problem of learning about and comparing the consequences of dynamic treatment strategies on the basis of observational data. We formulate this within a probabilistic decision-theoretic framework. Our approach is compared with related work by Robins and others: in particular, we show how Robins's 'G-computation' algorithm arises natu...

Nuala Sheehan and colleagues describe how Mendelian randomization provides an alternative way of dealing with the problems of observational studies, especially confounding.

ML–estimation of regression parameters with incomplete covariate information usually requires a distributional assumption regarding the concerned covariates that implies a source of misspecification. Semiparametric procedures avoid such assumptions at the expense of efficiency. In this paper a simulation study with small sample size is carried out...

A new class of graphical models capturing the dependence structure of events that occur in time is proposed. The graphs represent so-called local independences, meaning that the intensities of certain types of events are independent of some (but not necessarily all) events in the past. This dynamic concept of independence is asymmetric, similar to...

In epidemiological research, the causal effect of a modifiable phenotype or exposure on a disease is often of public health interest. Randomized controlled trials to investigate this effect are not always possible and inferences based on observational data can be confounded. However, if we know of a gene closely linked to the phenotype without dire...

Composable Markov processes were introduced by Schweder (1970) in order to capture the idea that a process can be composed of different components where some of these only depend on a subset of the other components. Here we propose a graphical representation of this kind of dependence which has been called 'local dependence'. It is shown that the g...

We present step-wise test procedures based on the Bonferroni-Holm [see S. Holm, Scand. J. Stat., Theory Appl. 6, 65–70 (1979; Zbl 0402.62058)] principle for multi-way ANOVA-type models. It is shown for two plausible modifications that the multiple level α is preserved. These theoretical results are supplemented by a simulation study, in a two-way A...

We investigate the possibility of exploiting partial correlation graphs for identifying interpretable latent variables underlying a multivariate time series. It is shown how the collapsibility and separation properties of partial correlation graphs can be used to understand the relation between a factor model and the structure among the observable...

In epidemiological research, the effect of a potentially modifiable phenotype or exposure on a particular outcome or disease is often of public health interest. Inferences on this effect can be distorted in the presence of confounders affecting both phenotype and disease. Randomised controlled trials are not always a viable option so reliable infe...

Latent variable techniques are helpful to reduce high-dimensional time series to a few relevant variables that are easier to model and analyze. An inherent problem is the identifiability of the model and the interpretation of the latent variables. We apply graphical models to find the essential relations in the data and to deduce suitable assumptio...

CG-regressions are multivariate regression models for mixed continuous and discrete responses that result from conditioning in the class of conditional Gaussian (CG) models. Their conditional independence structure can be read off a marked graph. The property of collapsibility, in this context, means that the multivariate CG-regression can be decom...

Invited discussion of the paper by Susan Murphy "Optimal dynamic treatment regimes"

Quantitative research especially in the social, but also in the biological sciences has been limited by the availability and applicability of analytic techniques that elaborate interactions among behaviours, treatment effects, and mediating variables. This gap has been filled by a newly developed statistical technique, known as graphical interactio...

For ethical or practical reasons, randomised controlled trials are not always an option to test epidemiological hypotheses. Epidemiologists are consequently faced with the problem of how to make causal inferences from observational data, particularly when confounding is present and not fully understood. The method of instrumental variables can be ex...

We consider situations where data have been collected such that the sampling depends on the outcome of interest and possibly further covariates, as for instance in case-control studies. Graphical models represent assumptions about the conditional independencies among the variables. By including a node for the sampling indicator, assumptions a...

This and other timely subjects are taken up in the present course. The main statistical methods presented belong to the area of graphical models, which is a tool for describing the relationship between variables, genes and other entities in genetics and epidemiology. Such models are increasingly used due to their nice representation of statistical rel...

Mendelian randomisation is an instrumental variable method using a genetic variant as instrument to estimate the causal effect of exposure on disease in the presence of unobserved confounding. We discuss the modelling assumptions required by various estimators based on instrumental variables, particularly with regard to their plausibility for typic...

## Citations

... The PC-algorithm had to be modified for application to multiply imputed cohort data [20][21][22]. To account for the cohort structure we used our temporal PC-algorithm tPC [22]; this was further combined with functions from micd [21] to deal with multiply imputed data containing a mix of categorical and continuous variables. ...

... We are therefore unable to empirically explore this interpretation further here. MR still produces a valid test for the average lifetime effect of exposure to genetically predicted levels of the exposure [64]. We therefore believe that our results are a valid, if blunt, measure of the total longitudinal effect up to recruitment. ...