Fan Li
Yale University | YU · Department of Biostatistics

Doctor of Philosophy

187 Publications
15,645 Reads
3,309 Citations
Additional affiliations
January 2015 - September 2020
Duke University
Position
  • PhD Student

Publications (187)
Article
Patient-centered outcomes, such as quality of life and length of hospital stay, are the focus in a wide array of clinical studies. However, participants in randomized trials for elderly or critically and severely ill patient populations may have truncated or undefined non-mortality outcomes if they do not survive through the measurement time point....
Preprint
Full-text available
Cluster-randomized trials (CRTs) are a well-established class of designs for evaluating large-scale, community-based research questions. An essential task in planning these trials is determining the required number of clusters and cluster sizes to achieve sufficient statistical power for detecting a clinically relevant effect size. Compared to meth...
Preprint
Full-text available
Intercurrent events, common in clinical trials and observational studies, affect the existence or interpretation of final outcomes. Principal stratification addresses these challenges by defining local average treatment effects within latent subpopulations, but often relies on restrictive assumptions such as monotonicity and counterfactual intermed...
Article
Full-text available
Background: A key challenge for many critical care clinical trials is that some patients will die before their outcome is fully measured. This is referred to as “truncation due to death” and must be accounted for in both the treatment effect definition (i.e., the estimand) and the statistical analysis approach. It is unknown which analytic ap...
Article
Full-text available
Background: Cluster randomized trials (CRTs) are increasingly important for evaluating interventions embedded in health care systems. An essential parameter in sample size calculation to detect both overall and heterogeneous treatment effects for CRTs is the intra-cluster correlation coefficient (ICC) of both outcome and covariates of interest. Howe...
Article
Importance: Acute kidney injury (AKI) is a common complication during hospitalization and is associated with adverse outcomes. Objective: To evaluate whether diagnostic and therapeutic recommendations sent by a kidney action team through the electronic health record improve outcomes among patients hospitalized with AKI compared with usual care. Des...
Article
While inverse probability of treatment weighting (IPTW) is a commonly used approach for treatment comparisons in observational data, the resulting estimates may be subject to bias and excessively large variance under lack of overlap. By smoothly down-weighting units with extreme propensity scores, i.e., those that are close (or equal) to zero or on...
Preprint
Full-text available
In clinical trials, the observation of participant outcomes may frequently be hindered by death, leading to ambiguity in defining a scientifically meaningful final outcome for those who die. Principal stratification methods are valuable tools for addressing the average causal effect among always-survivors, i.e., the average treatment effect among a...
Article
Background/Aims: Stepped-wedge cluster randomized trials tend to require fewer clusters than standard parallel-arm designs due to the switches between control and intervention conditions, but there are no recommendations for the minimum number of clusters. Trials randomizing an extremely small number of clusters are not uncommon, but the justificati...
Article
Technological advancements in noninvasive imaging facilitate the construction of whole brain interconnected networks, known as brain connectivity. Existing approaches to analyze brain connectivity frequently disaggregate the entire network into a vector of unique edges or summary measures, leading to a substantial loss of information. Motivated by...
Article
Many individually randomized group treatment (IRGT) trials randomly assign individuals to study arms but deliver treatments via shared agents, such as therapists, surgeons, or trainers. Post‐randomization interactions induce correlations in outcome measures between participants sharing the same agent. Agents can be nested in or crossed with trial a...
Article
Full-text available
Linear mixed models are commonly used in analyzing stepped-wedge cluster randomized trials. A key consideration for analyzing a stepped-wedge cluster randomized trial is accounting for the potentially complex correlation structure, which can be achieved by specifying random-effects. The simplest random effects structure is random intercept but more...
Article
Estimands can help clarify the interpretation of treatment effects and ensure that estimators are aligned with the study's objectives. Cluster-randomised trials require additional attributes to be defined within the estimand compared to individually randomised trials, including whether treatment effects are marginal or cluster-specific, and whether...
Article
The cluster randomized crossover design has been proposed to improve efficiency over the traditional parallel-arm cluster randomized design. While statistical methods have been developed for designing cluster randomized crossover trials, they have exclusively focused on testing the overall average treatment effect, with little attention to differen...
Article
Importance: Aortic stenosis (AS) is a major public health challenge with a growing therapeutic landscape, but current biomarkers do not inform personalized screening and follow-up. A video-based artificial intelligence (AI) biomarker (Digital AS Severity index [DASSi]) can detect severe AS using single-view long-axis echocardiography without Doppler...
Article
Full-text available
Understanding whether and how treatment effects vary across subgroups is crucial to inform clinical practice and recommendations. Accordingly, the assessment of heterogeneous treatment effects based on pre-specified potential effect modifiers has become a common goal in modern randomized trials. However, when one or more potential effect modifiers...
Preprint
Full-text available
Principal stratification is a popular framework for causal inference in the presence of an intermediate outcome. While the principal average treatment effects have traditionally been the default target of inference, it may not be sufficient when the interest lies in the relative favorability of one potential outcome over the other within the princi...
Article
Full-text available
Multi-period cluster randomized trials (CRTs) are increasingly used for the evaluation of interventions delivered at the group level. While generalized estimating equations (GEE) are commonly used to provide population-averaged inference in CRTs, there is a gap in general methods and statistical software tools for power calculation based on multi-p...
Article
Generalized estimating equations (GEEs) provide a useful framework for estimating marginal regression parameters based on data from cluster randomized trials (CRTs), but they can result in inaccurate parameter estimates when some outcomes are informatively missing. Existing techniques to handle missing outcomes in CRTs rely on correct specification...
Preprint
Full-text available
In longitudinal observational studies with a time-to-event outcome, a common objective in causal analysis is to estimate the causal survival curve under hypothetical intervention scenarios within the study cohort. The g-formula is a particularly useful tool for this analysis. To enhance the traditional parametric g-formula approach, we developed a...
Article
Full-text available
Introduction: Focal segmental glomerulosclerosis (FSGS), the most common primary glomerular disease leading to end-stage kidney disease (ESKD), is characterized by podocyte injury and depletion, whereas minimal change disease (MCD) has better outcomes despite podocyte injury. Identifying mechanisms capable of preventing podocytopenia during injury c...
Article
Recurrent events are common in clinical studies and are often subject to terminal events. In pragmatic trials, participants are often nested in clinics and can be susceptible or structurally unsusceptible to the recurrent events. We develop a Bayesian shared random effects model to accommodate this complex data structure. To achieve robustness, we...
Article
Full-text available
Introduction: Despite evidence supporting the benefits of marriage on cardiovascular health, the impact of marital/partner status on the long-term readmission of young acute myocardial infarction (AMI) survivors is less clear. We examined the association between marital/partner status and 1-year all-cause readmission and explored sex differences amo...
Article
When the distributions of treatment effect modifiers differ between a randomized trial and an external target population, the sample average treatment effect in the trial may be substantially different from the target population average treatment, and accurate estimation of the latter requires adjusting for the differential distribution of effect m...
Article
Stepped wedge design is a popular research design that enables a rigorous evaluation of candidate interventions by using a staggered cluster randomization strategy. While analytical methods were developed for designing stepped wedge trials, the prior focus has been solely on testing for the average treatment effect. With a growing interest on forma...
Article
The two‐stage preference design (TSPD) enables inference for treatment efficacy while allowing for incorporation of patient preference to treatment. It can provide unbiased estimates for selection and preference effects, where a selection effect occurs when patients who prefer one treatment respond differently than those who prefer another, and a p...
Article
Individually randomized group treatment (IRGT) trials, in which the clustering of outcome is induced by group‐based treatment delivery, are increasingly popular in public health research. IRGT trials frequently incorporate longitudinal measurements, of which the proper sample size calculations should account for correlation structures reflecting bo...
Article
Background/Aims: The stepped-wedge cluster randomized trial (SW-CRT), in which clusters are randomized to a time at which they will transition to the intervention condition – rather than a trial arm – is a relatively new design. SW-CRTs have additional design and analytical considerations compared to conventional parallel arm trials. To inform futur...
Article
In many medical studies, the outcome measure (such as quality of life, QOL) for some study participants becomes informatively truncated (censored, missing, or unobserved) due to death or other forms of dropout, creating a nonignorable missing data problem. In such cases, the use of a composite outcome or imputation methods that fill in unmeasurable...
Article
Objectives: Propensity score (PS) weighting methods are commonly used to adjust for confounding in observational treatment comparisons. However, in the setting of substantial covariate imbalance, PS values may approach 0 and 1, yielding extreme weights and inflated variance of the estimated treatment effect. Adaptations of the standard inverse proba...
Article
Full-text available
Cluster-randomized trials (CRTs) often allocate intact clusters of participants to treatment or control conditions and are increasingly used to evaluate healthcare delivery interventions. While previous studies have developed sample size methods for testing confirmatory hypotheses of treatment effect heterogeneity in CRTs (i.e., targeting the diffe...
Preprint
Full-text available
Background: The timely identification of aortic stenosis (AS) and disease stage that merits intervention requires frequent echocardiography. However, there is no strategy to personalize the frequency of monitoring needed. Objectives: To explore the role of AI-enhanced two-dimensional-echocardiography in stratifying the risk of AS development and pro...
Article
In an individually randomized group treatment (IRGT) trial, participant outcomes can be positively correlated due to, for example, shared therapists in treatment delivery. Oftentimes, because of limited treatment resources or participants at one location, an IRGT trial can be carried out across multiple centers. This design can be subject to potent...
Article
Cluster randomized trials (CRTs) refer to a popular class of experiments in which randomization is carried out at the group level. While methods have been developed for planning CRTs to study the average treatment effect, and more recently, to study the heterogeneous treatment effect, the development for the latter objective has currently been limi...
Article
A population‐averaged additive subdistribution hazards model is proposed to assess the marginal effects of covariates on the cumulative incidence function and to analyze correlated failure time data subject to competing risks. This approach extends the population‐averaged additive hazards model by accommodating potentially dependent censoring due t...
Article
Full-text available
Background: Stress experienced in a marriage or committed relationship may be associated with worse patient‐reported outcomes after acute myocardial infarction (AMI), but little is known about this association in young adults (≤55 years) with AMI. Methods and Results: We used data from VIRGO (Variation in Recovery: Role of Gender on Outcomes of Youn...
Preprint
Full-text available
Linear mixed models are commonly used in analyzing stepped-wedge cluster randomized trials (SW-CRTs). A key consideration for analyzing a SW-CRT is accounting for the potentially complex correlation structure, which can be achieved by specifying a random effects structure. Common random effects structures for a SW-CRT include random intercept, rand...
Article
Background: Recent work has shown that cluster-randomised trials can estimate two distinct estimands: the participant-average and cluster-average treatment effects. These can differ when participant outcomes or the treatment effect depends on the cluster size (termed informative cluster size). In this case, estimators that target one estimand (suc...
Preprint
Full-text available
Introduction: Despite evidence supporting the benefits of marriage on cardiovascular health, the impact of marital/partner status on the long-term readmission of young acute myocardial infarction (AMI) survivors is less clear. We aimed to examine the association between marital/partner status and 1-year all-cause readmission, and explore sex differ...
Article
Assessing heterogeneity in the effects of treatments has become increasingly popular in the field of causal inference and carries important implications for clinical decision-making. While extensive literature exists for studying treatment effect heterogeneity when outcomes are fully observed, there has been limited development in tools for estimat...
Article
Cluster randomized trials (CRTs) are studies where treatment is randomized at the cluster level but outcomes are typically collected at the individual level. When CRTs are employed in pragmatic settings, baseline population characteristics may moderate treatment effects, leading to what is known as heterogeneous treatment effects (HTEs). Pre-specif...
Preprint
Full-text available
Stepped wedge cluster randomized experiments represent a class of unidirectional crossover designs that are increasingly adopted for comparative effectiveness and implementation science research. Although stepped wedge cluster randomized experiments have become popular, definitions of estimands and robust methods to target clearly-defined estimands...
Article
An important consideration in the design and analysis of randomized trials is the need to account for outcome observations being positively correlated within groups or clusters. Two notable types of designs with this consideration are individually randomized group treatment trials and cluster randomized trials. While sample size methods for testing...
Article
Full-text available
Acute kidney injury is common among hospitalized individuals, particularly those exposed to certain medications, and is associated with substantial morbidity and mortality. In a pragmatic, open-label, National Institutes of Health-funded, parallel group randomized controlled trial (clinicaltrials.gov NCT02771977), we investigate whether an automate...
Article
It is well-known that designing a cluster randomized trial (CRT) requires an advance estimate of the intra-cluster correlation coefficient (ICC). In the case of longitudinal CRTs, where outcomes are assessed repeatedly in each cluster over time, estimates for more complex correlation structures are required. Three common types of correlation struct...
Article
Background: Targeting short-term improvements in multicomponent risk scores for mortality in patients with pulmonary arterial hypertension (PAH) could result in improved long-term outcomes. We aimed to determine whether PAH risk scores were adequate surrogates for clinical worsening or mortality outcomes in PAH randomised clinical trials (RCTs)....
Article
Full-text available
The currently recommended dose of dexamethasone for patients with severe or critical COVID-19 is 6 mg per day (mg/d) regardless of patient features and variation. However, patients with severe or critical COVID-19 are heterogeneous in many ways (e.g., age, weight, comorbidities, disease severity, and immune features). Thus, it is conceivable that a...
Article
Cluster-randomized trials (CRTs) involve randomizing entire groups of participants, called clusters, to treatment arms but are often comprised of a limited or fixed number of available clusters. While covariate adjustment can account for chance imbalances between treatment arms and increase statistical efficiency in individually randomized trials, an...
Preprint
Full-text available
Causal mediation analysis is widely used in health science research to evaluate the extent to which an intermediate variable explains an observed exposure-outcome relationship. However, the validity of analysis can be compromised when the exposure is measured with error, which is common in health science studies. This article investigates the impac...
Article
Full-text available
Background: Detecting treatment effect heterogeneity is an important objective in cluster randomized trials and implementation research. While sample size procedures for testing the average treatment effect accounting for participant attrition assuming missing completely at random or missing at random have been previously developed, the impact of at...
Preprint
Full-text available
While the inverse probability of treatment weighting (IPTW) is a commonly used approach for treatment comparisons in observational data, the resulting estimates may be subject to bias and excessively large variance when there is lack of overlap in the propensity score distributions. By smoothly down-weighting the units with extreme propensity score...
Article
Background and objectives: Marginal models with generalized estimating equations (GEE) are usually recommended for analyzing correlated ordinal outcomes which are commonly seen in a longitudinal study or clustered randomized trial (CRT). Within-cluster association is often of interest in longitudinal studies or CRTs, and can be estimated with pair...
Preprint
Full-text available
Estimands can help clarify the interpretation of treatment effects and ensure that estimators are aligned to the study's objectives. Cluster randomised trials require additional attributes to be defined within the estimand compared to individually randomised trials, including whether treatment effects are marginal or cluster specific, and whether t...
Article
Full-text available
Background: The effectiveness of malaria vector control interventions is often evaluated using cluster randomized trials (CRT) with outcomes assessed using repeated cross-sectional surveys. A key requirement for appropriate design and analysis of longitudinal CRTs is accounting for the intra-cluster correlation coefficient (ICC). In addition to exch...
Article
Background and objectives: Generalized estimating equations (GEE) are used to analyze correlated outcomes in marginal regression models with population-averaged interpretations of exposure effects. Limitations of popular software for GEE include: (i) user choice is restricted to a small set of within-cluster pairwise correlation (intra-class corre...
Article
The difference method is used in mediation analysis to quantify the extent to which a mediator explains the mechanisms underlying the pathway between an exposure and an outcome. In many health science studies, the exposures are almost never measured without error, which can result in biased effect estimates. This article investigates methods for me...
Article
Objectives: In stepped-wedge cluster randomized trials (SW-CRTs), clusters are randomized not to treatment and control arms, but to sequences dictating the times of crossing from control to intervention conditions. Randomization is an essential feature of this design but application of standard methods to promote and report on balance at baseline...
Conference Paper
Background: Marital stress is associated with worse cardiac outcomes in young adults (≤55 years) with acute myocardial infarction (AMI), but whether psychosocial factors mediate this association remains largely unknown. We conducted a mediation analysis to investigate whether marital stress worsened quality of life (QoL) after AMI by increasing the...
Article
Full-text available
Many studies encounter clustering due to multicenter enrollment and non-mortality outcomes, such as quality-of-life, that are truncated due to death; i.e., missing not at random and nonignorable. Traditional missing data methods and target causal estimands are suboptimal for statistical inference in the presence of these combined issues, which are...
Preprint
Full-text available
Cluster randomized trials (CRTs) are studies where treatment is randomized at the cluster level but outcomes are typically collected at the individual level. When CRTs are employed in pragmatic settings, baseline population characteristics may moderate treatment effects, leading to what is known as heterogeneous treatment effects (HTEs). Pre-specif...
Article
Full-text available
Most reported treatment effects in medical research studies are ambiguously defined, which can lead to misinterpretation of study results. This is because most studies do not attempt to describe what the treatment effect represents, and instead require readers to deduce this based on the reported statistical methods. However, this approach is fraug...
Article
Full-text available
Introduction: To improve dementia care delivery for persons across all backgrounds, it is imperative that health equity is integrated into pragmatic trials. Methods: We reviewed 62 pragmatic trials of people with dementia published 2014 to 2019. We assessed health equity in the objectives; design, conduct, analysis; and reporting using PROGRESS-...
Article
Full-text available
Background: Treatment of severe inpatient hypertension (HTN) that develops during hospitalization is not informed by guidelines. Intravenous (i.v.) antihypertensives are used to manage severe HTN even in the absence of acute target organ damage; however they may result in unpredictable blood pressure (BP) reduction and cardiovascular events. Our g...
Article
Stepped wedge cluster randomized trials (SW-CRTs) are increasingly being used to evaluate interventions in medical, public health, educational, and social science contexts. With the longitudinal and crossover natures of an SW-CRT, complex analysis techniques are often needed, which makes appropriately powering SW-CRTs challenging. In this article,...
Article
Multivariate outcomes are common in pragmatic cluster randomized trials. While sample size calculation procedures for multivariate outcomes exist under parallel assignment, none have been developed for a stepped wedge design. In this article, we present computationally efficient power and sample size procedures for stepped wedge cluster randomized...
Article
Cluster randomized trials (CRTs) frequently recruit a small number of clusters, therefore necessitating the application of small‐sample corrections for valid inference. A recent systematic review indicated that CRTs reporting right‐censored, time‐to‐event outcomes are not uncommon and that the marginal Cox proportional hazards model is one of the c...
Article
Simulation studies play an important role in evaluating the performance of statistical models developed for analyzing complex survival data such as those with competing risks and clustering. This article aims to provide researchers with a basic understanding of competing risks data generation, techniques for inducing cluster-level correlation, and...
Article
Full-text available
Background: Psychosocial stress is associated with worse cardiac outcomes, but little is known about the prognostic impact of marital stress in young adults (≤55 years) with acute myocardial infarction (AMI). We investigated the association between marital stress and 1-year health outcomes in young AMI survivors. Methods: We used data from the VIRG...
Article
A stepped wedge cluster randomized trial is a unidirectional crossover study in which timings of treatment initiation for clusters are randomized. Because the timing of treatment initiation is different for each cluster, an emerging question is whether the treatment effect depends on the exposure time, namely, the time duration since the initiation...
Article
Full-text available
Stepped wedge designs have uni-directional crossovers at randomly assigned time points (steps) where clusters switch from control to intervention condition. Incomplete stepped wedge designs are increasingly used in cluster randomized trials of health care interventions and have periods without data collection due to logistical, resource and patient...