
Fan Li
Yale University | YU · Department of Biostatistics
Doctor of Philosophy
187 Publications
15,645 Reads
3,309 Citations
Additional affiliations
January 2015 - September 2020
Publications (187)
Patient-centered outcomes, such as quality of life and length of hospital stay, are the focus in a wide array of clinical studies. However, participants in randomized trials for elderly or critically and severely ill patient populations may have truncated or undefined non-mortality outcomes if they do not survive through the measurement time point....
Cluster-randomized trials (CRTs) are a well-established class of designs for evaluating large-scale, community-based research questions. An essential task in planning these trials is determining the required number of clusters and cluster sizes to achieve sufficient statistical power for detecting a clinically relevant effect size. Compared to meth...
Intercurrent events, common in clinical trials and observational studies, affect the existence or interpretation of final outcomes. Principal stratification addresses these challenges by defining local average treatment effects within latent subpopulations, but often relies on restrictive assumptions such as monotonicity and counterfactual intermed...
Background
A key challenge for many critical care clinical trials is that some patients will die before their outcome is fully measured. This is referred to as “truncation due to death” and must be accounted for in both the treatment effect definition (i.e. the estimand), as well as the statistical analysis approach. It is unknown which analytic ap...
Background
Cluster randomized trials (CRTs) are increasingly important for evaluating interventions embedded in health care systems. An essential parameter in sample size calculation to detect both overall and heterogeneous treatment effects for CRTs is the intra-cluster correlation coefficient (ICC) of both outcome and covariates of interest. Howe...
Importance
Acute kidney injury (AKI) is a common complication during hospitalization and is associated with adverse outcomes.
Objective
To evaluate whether diagnostic and therapeutic recommendations sent by a kidney action team through the electronic health record improve outcomes among patients hospitalized with AKI compared with usual care.
Des...
While inverse probability of treatment weighting (IPTW) is a commonly used approach for treatment comparisons in observational data, the resulting estimates may be subject to bias and excessively large variance under lack of overlap. By smoothly down-weighting units with extreme propensity scores, i.e., those that are close (or equal) to zero or on...
In clinical trials, the observation of participant outcomes may frequently be hindered by death, leading to ambiguity in defining a scientifically meaningful final outcome for those who die. Principal stratification methods are valuable tools for addressing the average causal effect among always-survivors, i.e., the average treatment effect among a...
Background/Aims
Stepped-wedge cluster randomized trials tend to require fewer clusters than standard parallel-arm designs due to the switches between control and intervention conditions, but there are no recommendations for the minimum number of clusters. Trials randomizing an extremely small number of clusters are not uncommon, but the justificati...
Technological advancements in noninvasive imaging facilitate the construction of whole brain interconnected networks, known as brain connectivity. Existing approaches to analyze brain connectivity frequently disaggregate the entire network into a vector of unique edges or summary measures, leading to a substantial loss of information. Motivated by...
Many individually randomized group treatment (IRGT) trials randomly assign individuals to study arms but deliver treatments via shared agents, such as therapists, surgeons, or trainers. Post‐randomization interactions induce correlations in outcome measures between participants sharing the same agent. Agents can be nested in or crossed with trial a...
Linear mixed models are commonly used in analyzing stepped-wedge cluster randomized trials. A key consideration for analyzing a stepped-wedge cluster randomized trial is accounting for the potentially complex correlation structure, which can be achieved by specifying random effects. The simplest random effects structure is a random intercept, but more...
Estimands can help clarify the interpretation of treatment effects and ensure that estimators are aligned with the study's objectives. Cluster-randomised trials require additional attributes to be defined within the estimand compared to individually randomised trials, including whether treatment effects are marginal or cluster-specific, and whether...
The cluster randomized crossover design has been proposed to improve efficiency over the traditional parallel-arm cluster randomized design. While statistical methods have been developed for designing cluster randomized crossover trials, they have exclusively focused on testing the overall average treatment effect, with little attention to differen...
Importance
Aortic stenosis (AS) is a major public health challenge with a growing therapeutic landscape, but current biomarkers do not inform personalized screening and follow-up. A video-based artificial intelligence (AI) biomarker (Digital AS Severity index [DASSi]) can detect severe AS using single-view long-axis echocardiography without Doppler...
Understanding whether and how treatment effects vary across subgroups is crucial to inform clinical practice and recommendations. Accordingly, the assessment of heterogeneous treatment effects based on pre-specified potential effect modifiers has become a common goal in modern randomized trials. However, when one or more potential effect modifiers...
Principal stratification is a popular framework for causal inference in the presence of an intermediate outcome. While the principal average treatment effects have traditionally been the default target of inference, it may not be sufficient when the interest lies in the relative favorability of one potential outcome over the other within the princi...
Multi-period cluster randomized trials (CRTs) are increasingly used for the evaluation of interventions delivered at the group level. While generalized estimating equations (GEE) are commonly used to provide population-averaged inference in CRTs, there is a lack of general methods and statistical software tools for power calculation based on multi-p...
Generalized estimating equations (GEEs) provide a useful framework for estimating marginal regression parameters based on data from cluster randomized trials (CRTs), but they can result in inaccurate parameter estimates when some outcomes are informatively missing. Existing techniques to handle missing outcomes in CRTs rely on correct specification...
In longitudinal observational studies with a time-to-event outcome, a common objective in causal analysis is to estimate the causal survival curve under hypothetical intervention scenarios within the study cohort. The g-formula is a particularly useful tool for this analysis. To enhance the traditional parametric g-formula approach, we developed a...
Introduction
Focal segmental glomerulosclerosis (FSGS), the most common primary glomerular disease leading to end-stage kidney disease (ESKD), is characterized by podocyte injury and depletion, whereas minimal change disease (MCD) has better outcomes despite podocyte injury. Identifying mechanisms capable of preventing podocytopenia during injury c...
Recurrent events are common in clinical studies and are often subject to terminal events. In pragmatic trials, participants are often nested in clinics and can be susceptible or structurally unsusceptible to the recurrent events. We develop a Bayesian shared random effects model to accommodate this complex data structure. To achieve robustness, we...
Introduction
Despite evidence supporting the benefits of marriage on cardiovascular health, the impact of marital/partner status on the long-term readmission of young acute myocardial infarction (AMI) survivors is less clear. We examined the association between marital/partner status and 1-year all-cause readmission and explored sex differences amo...
When the distributions of treatment effect modifiers differ between a randomized trial and an external target population, the sample average treatment effect in the trial may be substantially different from the target population average treatment effect, and accurate estimation of the latter requires adjusting for the differential distribution of effect m...
Stepped wedge design is a popular research design that enables a rigorous evaluation of candidate interventions by using a staggered cluster randomization strategy. While analytical methods were developed for designing stepped wedge trials, the prior focus has been solely on testing for the average treatment effect. With a growing interest on forma...
The two‐stage preference design (TSPD) enables inference for treatment efficacy while allowing for incorporation of patient preference to treatment. It can provide unbiased estimates for selection and preference effects, where a selection effect occurs when patients who prefer one treatment respond differently than those who prefer another, and a p...
Individually randomized group treatment (IRGT) trials, in which the clustering of outcome is induced by group‐based treatment delivery, are increasingly popular in public health research. IRGT trials frequently incorporate longitudinal measurements, of which the proper sample size calculations should account for correlation structures reflecting bo...
Background/Aims
The stepped-wedge cluster randomized trial (SW-CRT), in which clusters are randomized to a time at which they will transition to the intervention condition – rather than a trial arm – is a relatively new design. SW-CRTs have additional design and analytical considerations compared to conventional parallel arm trials. To inform futur...
In many medical studies, the outcome measure (such as quality of life, QOL) for some study participants becomes informatively truncated (censored, missing, or unobserved) due to death or other forms of dropout, creating a nonignorable missing data problem. In such cases, the use of a composite outcome or imputation methods that fill in unmeasurable...
Objectives
Propensity score (PS) weighting methods are commonly used to adjust for confounding in observational treatment comparisons. However, in the setting of substantial covariate imbalance, PS values may approach 0 and 1, yielding extreme weights and inflated variance of the estimated treatment effect. Adaptations of the standard inverse proba...
Cluster-randomized trials (CRTs) often allocate intact clusters of participants to treatment or control conditions and are increasingly used to evaluate healthcare delivery interventions. While previous studies have developed sample size methods for testing confirmatory hypotheses of treatment effect heterogeneity in CRTs (i.e., targeting the diffe...
Background
The timely identification of aortic stenosis (AS) and disease stage that merits intervention requires frequent echocardiography. However, there is no strategy to personalize the frequency of monitoring needed.
Objectives
To explore the role of AI-enhanced two-dimensional-echocardiography in stratifying the risk of AS development and pro...
In an individually randomized group treatment (IRGT) trial, participant outcomes can be positively correlated due to, for example, shared therapists in treatment delivery. Oftentimes, because of limited treatment resources or participants at one location, an IRGT trial can be carried out across multiple centers. This design can be subject to potent...
Cluster randomized trials (CRTs) refer to a popular class of experiments in which randomization is carried out at the group level. While methods have been developed for planning CRTs to study the average treatment effect, and more recently, to study the heterogeneous treatment effect, the development for the latter objective has currently been limi...
A population‐averaged additive subdistribution hazards model is proposed to assess the marginal effects of covariates on the cumulative incidence function and to analyze correlated failure time data subject to competing risks. This approach extends the population‐averaged additive hazards model by accommodating potentially dependent censoring due t...
Background
Stress experienced in a marriage or committed relationship may be associated with worse patient‐reported outcomes after acute myocardial infarction (AMI), but little is known about this association in young adults (≤55 years) with AMI.
Methods and Results
We used data from VIRGO (Variation in Recovery: Role of Gender on Outcomes of Youn...
Linear mixed models are commonly used in analyzing stepped-wedge cluster randomized trials (SW-CRTs). A key consideration for analyzing an SW-CRT is accounting for the potentially complex correlation structure, which can be achieved by specifying a random effects structure. Common random effects structures for an SW-CRT include random intercept, rand...
Background:
Recent work has shown that cluster-randomised trials can estimate two distinct estimands: the participant-average and cluster-average treatment effects. These can differ when participant outcomes or the treatment effect depends on the cluster size (termed informative cluster size). In this case, estimators that target one estimand (suc...
Introduction: Despite evidence supporting the benefits of marriage on cardiovascular health, the impact of marital/partner status on the long-term readmission of young acute myocardial infarction (AMI) survivors is less clear. We aimed to examine the association between marital/partner status and 1-year all-cause readmission, and explore sex differ...
Assessing heterogeneity in the effects of treatments has become increasingly popular in the field of causal inference and carries important implications for clinical decision-making. While extensive literature exists for studying treatment effect heterogeneity when outcomes are fully observed, there has been limited development in tools for estimat...
Cluster randomized trials (CRTs) are studies where treatment is randomized at the cluster level but outcomes are typically collected at the individual level. When CRTs are employed in pragmatic settings, baseline population characteristics may moderate treatment effects, leading to what is known as heterogeneous treatment effects (HTEs). Pre-specif...
Stepped wedge cluster randomized experiments represent a class of unidirectional crossover designs that are increasingly adopted for comparative effectiveness and implementation science research. Although stepped wedge cluster randomized experiments have become popular, definitions of estimands and robust methods to target clearly-defined estimands...
An important consideration in the design and analysis of randomized trials is the need to account for outcome observations being positively correlated within groups or clusters. Two notable types of designs with this consideration are individually randomized group treatment trials and cluster randomized trials. While sample size methods for testing...
Acute kidney injury is common among hospitalized individuals, particularly those exposed to certain medications, and is associated with substantial morbidity and mortality. In a pragmatic, open-label, National Institutes of Health-funded, parallel group randomized controlled trial (clinicaltrials.gov NCT02771977), we investigate whether an automate...
It is well-known that designing a cluster randomized trial (CRT) requires an advance estimate of the intra-cluster correlation coefficient (ICC). In the case of longitudinal CRTs, where outcomes are assessed repeatedly in each cluster over time, estimates for more complex correlation structures are required. Three common types of correlation struct...
Background:
Targeting short-term improvements in multicomponent risk scores for mortality in patients with pulmonary arterial hypertension (PAH) could result in improved long-term outcomes. We aimed to determine whether PAH risk scores were adequate surrogates for clinical worsening or mortality outcomes in PAH randomised clinical trials (RCTs)....
The currently recommended dose of dexamethasone for patients with severe or critical COVID-19 is 6 mg per day (mg/d) regardless of patient features and variation. However, patients with severe or critical COVID-19 are heterogeneous in many ways (e.g., age, weight, comorbidities, disease severity, and immune features). Thus, it is conceivable that a...
Cluster-randomized trials (CRTs) involve randomizing entire groups of participants, called clusters, to treatment arms but often comprise a limited or fixed number of available clusters. While covariate adjustment can account for chance imbalances between treatment arms and increase statistical efficiency in individually randomized trials, an...
Causal mediation analysis is widely used in health science research to evaluate the extent to which an intermediate variable explains an observed exposure-outcome relationship. However, the validity of analysis can be compromised when the exposure is measured with error, which is common in health science studies. This article investigates the impac...
Background
Detecting treatment effect heterogeneity is an important objective in cluster randomized trials and implementation research. While sample size procedures for testing the average treatment effect accounting for participant attrition assuming missing completely at random or missing at random have been previously developed, the impact of at...
While the inverse probability of treatment weighting (IPTW) is a commonly used approach for treatment comparisons in observational data, the resulting estimates may be subject to bias and excessively large variance when there is lack of overlap in the propensity score distributions. By smoothly down-weighting the units with extreme propensity score...
Background and objectives:
Marginal models with generalized estimating equations (GEE) are usually recommended for analyzing correlated ordinal outcomes, which are commonly seen in a longitudinal study or cluster randomized trial (CRT). Within-cluster association is often of interest in longitudinal studies or CRTs, and can be estimated with pair...
Estimands can help clarify the interpretation of treatment effects and ensure that estimators are aligned to the study's objectives. Cluster randomised trials require additional attributes to be defined within the estimand compared to individually randomised trials, including whether treatment effects are marginal or cluster specific, and whether t...
Background
The effectiveness of malaria vector control interventions is often evaluated using cluster randomized trials (CRT) with outcomes assessed using repeated cross-sectional surveys. A key requirement for appropriate design and analysis of longitudinal CRTs is accounting for the intra-cluster correlation coefficient (ICC). In addition to exch...
Background and objectives:
Generalized estimating equations (GEE) are used to analyze correlated outcomes in marginal regression models with population-averaged interpretations of exposure effects. Limitations of popular software for GEE include: (i) user choice is restricted to a small set of within-cluster pairwise correlation (intra-class corre...
The difference method is used in mediation analysis to quantify the extent to which a mediator explains the mechanisms underlying the pathway between an exposure and an outcome. In many health science studies, the exposures are almost never measured without error, which can result in biased effect estimates. This article investigates methods for me...
Objectives:
In stepped-wedge cluster randomized trials (SW-CRTs), clusters are randomized not to treatment and control arms, but to sequences dictating the times of crossing from control to intervention conditions. Randomization is an essential feature of this design but application of standard methods to promote and report on balance at baseline...
Background: Marital stress is associated with worse cardiac outcomes in young adults (≤55 years) with acute myocardial infarction (AMI), but whether psychosocial factors mediate this association remains largely unknown. We conducted a mediation analysis to investigate whether marital stress worsened quality of life (QoL) after AMI by increasing the...
Many studies encounter clustering due to multicenter enrollment and non-mortality outcomes, such as quality-of-life, that are truncated due to death; i.e., missing not at random and nonignorable. Traditional missing data methods and target causal estimands are suboptimal for statistical inference in the presence of these combined issues, which are...
Most reported treatment effects in medical research studies are ambiguously defined, which can lead to misinterpretation of study results. This is because most studies do not attempt to describe what the treatment effect represents, and instead require readers to deduce this based on the reported statistical methods. However, this approach is fraug...
Introduction:
To improve dementia care delivery for persons across all backgrounds, it is imperative that health equity is integrated into pragmatic trials.
Methods:
We reviewed 62 pragmatic trials of people with dementia published 2014 to 2019. We assessed health equity in the objectives; design, conduct, analysis; and reporting using PROGRESS-...
Background:
Treatment of severe inpatient hypertension (HTN) that develops during hospitalization is not informed by guidelines. Intravenous (i.v.) antihypertensives are used to manage severe HTN even in the absence of acute target organ damage; however they may result in unpredictable blood pressure (BP) reduction and cardiovascular events. Our g...
Stepped wedge cluster randomized trials (SW-CRTs) are increasingly being used to evaluate interventions in medical, public health, educational, and social science contexts. With the longitudinal and crossover natures of an SW-CRT, complex analysis techniques are often needed, which makes appropriately powering SW-CRTs challenging. In this article,...
Multivariate outcomes are common in pragmatic cluster randomized trials. While sample size calculation procedures for multivariate outcomes exist under parallel assignment, none have been developed for a stepped wedge design. In this article, we present computationally efficient power and sample size procedures for stepped wedge cluster randomized...
Cluster randomized trials (CRTs) frequently recruit a small number of clusters, therefore necessitating the application of small‐sample corrections for valid inference. A recent systematic review indicated that CRTs reporting right‐censored, time‐to‐event outcomes are not uncommon and that the marginal Cox proportional hazards model is one of the c...
Simulation studies play an important role in evaluating the performance of statistical models developed for analyzing complex survival data such as those with competing risks and clustering. This article aims to provide researchers with a basic understanding of competing risks data generation, techniques for inducing cluster-level correlation, and...
Background: Psychosocial stress is associated with worse cardiac outcomes, but little is known about the prognostic impact of marital stress in young adults (≤55 years) with acute myocardial infarction (AMI). We investigated the association between marital stress and 1-year health outcomes in young AMI survivors.
Methods: We used data from the VIRG...
A stepped wedge cluster randomized trial is a unidirectional crossover study in which timings of treatment initiation for clusters are randomized. Because the timing of treatment initiation is different for each cluster, an emerging question is whether the treatment effect depends on the exposure time, namely, the time duration since the initiation...
Stepped wedge designs have uni-directional crossovers at randomly assigned time points (steps) where clusters switch from control to intervention condition. Incomplete stepped wedge designs are increasingly used in cluster randomized trials of health care interventions and have periods without data collection due to logistical, resource and patient...