Article

Applying Quantitative Approaches in the Use of RWE in Clinical Development and Life-Cycle Management


Abstract

Randomized controlled clinical trials (RCTs) have been the gold standard for the evaluation of efficacy and safety of medical interventions. However, their costs, duration, practicality, and limited generalizability have incentivized many to look for alternatives. In recent years, we have seen increasing use of real-world data (RWD) and real-world evidence (RWE) in clinical development and life-cycle management, although many challenges remain. While there are numerous publications in the RWD and RWE areas, strategic planning and tactical execution perspectives on the end-to-end process are still lacking, including the use of RWE not only in regulatory settings but also in non-regulatory settings, along with organizational infrastructure considerations. We attempt, to the extent possible, to fill this void by providing thoughts on addressing the key challenges and use cases we have seen in real-world (RW) settings. As quantitative scientists working in drug development, we see tremendous potential in applying quantitative approaches in the use of RWE. To that end, we include discussions on opportunities where statisticians could play a key role in RWE research both in and beyond regulatory settings.


Chapter
The use of real-world data (RWD) and evidence (RWE) in clinical development to support regulatory decisions has gained increasing momentum in recent years. With the release of US Food and Drug Administration (FDA) draft guidance on Assessing Electronic Health Records (EHR) and Medical Claims Data and Registries to support regulatory decision-making for drug and biological products in September and November 2021, respectively, and several white papers from Duke Margolis in 2018 and 2019, considerable inroads have been made in the understanding of how fit-for-use RWD sources could be assessed. Further, the ASA Biopharmaceutical Section (BIOP)-sponsored RWE Scientific Working Group (SWG) developed a semi-quantitative approach to the assessment of fit-for-use RWD sources. In this chapter, we will first review the guiding principles as outlined in the literature. We will then provide a review and summary of RWD sources and the types of settings where RWD may be used more effectively and be fit-for-use for regulatory purposes. Using the semi-quantitative approach proposed by the RWE SWG, we will illustrate the actual assessment applications via an example from an RWD source.
Keywords: Real-world data; Real-world evidence; Fit-for-use; Randomized controlled trials; DUPLICATE; ConcertAI; Estimand; Data relevancy; Data reliability
Article
A Real-World Evidence (RWE) scientific working group of the American Statistical Association Biopharmaceutical Section has been reviewing the statistical considerations for the generation of real-world evidence to support regulatory decision making. As part of the effort, the working group is addressing the fitness-for-use of real-world data (RWD). RWD may be used in a variety of ways and study designs including in randomized studies, externally controlled studies, and purely observational studies. The use of RWD poses unique issues surrounding study integrity, transparency, and reproducibility. Rule-based methods and machine learning approaches can be used to extract key data elements from RWD sources. In some cases, multiple sources of data may be linked to obtain the necessary study data. Missing data may have unique considerations in the RWD sources, since data elements are collected for the practice of medicine and are not protocol driven. Lack or imperfect capture of some information in an RWD source may lead to multiple biases that threaten the fitness-for-use of an RWD source, including information bias, selection bias, and confounding. Validation studies and quantitative bias assessment can be used to assess the potential bias. The working group proposes a data-driven approach framework for determining the fit-for-use of RWD.
Article
Real-world data (RWD) is playing an increasingly important role in drug development, from early discovery through life-cycle management. This includes leveraging RWD in randomized clinical trial (RCT) design and study conduct. In many scenarios, a concurrent control arm may not be viable for ethical or practical reasons, and inclusion of an external control arm can greatly facilitate decision-making and interpretation of findings. We summarize the strengths and limitations of typical external data sources, including historical RCTs, aggregated study-level data from the literature, patient registries, health insurance claims, and electronic health records, in terms of fit-for-purpose data selection. To address the inherent confounding due to lack of randomization, the propensity score matching method has the advantages of separating the design from the analysis and providing the ability to explicitly examine the degree of overlap in confounders. Within the framework of causal inference, however, many alternatives have been proposed with desirable theoretical properties. In this article, we review key steps from study design conceptualization to data source selection, and focus on several methods for evaluation of performance in the context of creating external controls for clinical trials. We conducted focused simulation studies to assess bias reduction and statistical properties when underlying assumptions are violated or models are mis-specified. The results support that analysis using matched groups improves bias reduction when sample size is not a limiting factor, and that targeted maximum likelihood estimation coupled with super learner is robust when estimating both average treatment effects and average treatment effects among the treated.
Article
Background: Regulators are evaluating the use of non-interventional real-world evidence (RWE) studies to assess the effectiveness of medical products. The RCT-DUPLICATE initiative uses a structured process to design RWE studies emulating randomized controlled trials (RCTs) and compare results. Here, we report findings of the first 10 trial emulations, evaluating cardiovascular outcomes of antidiabetic or antiplatelet medications. Methods: We selected 3 active-controlled and 7 placebo-controlled RCTs for replication. Using patient-level claims data from US commercial and Medicare payers, we implemented inclusion/exclusion criteria, selected primary endpoints, and comparator populations to emulate those of each corresponding RCT. Within the trial-mimicking populations, we conducted propensity score matching to control for >120 pre-exposure confounders. All study parameters were prospectively defined and protocols registered before hazard ratios (HRs) and 95% confidence intervals (CIs) were computed. Success criteria for the primary analysis were pre-specified for each replication. Results: Despite attempts to emulate RCT design as closely as possible, differences between the RCT and corresponding RWE study populations remained. The regulatory conclusions were equivalent in 6 of 10. The RWE emulations achieved a HR estimate that was within the 95% CI from the corresponding RCT in 8 of 10 studies. In 9 of 10, either the regulatory or estimate agreement success criteria were fulfilled. The largest differences in effect estimates were found for RCTs where second-generation sulfonylureas were used as a proxy for placebo regarding cardiovascular effects. Nine of 10 replications had a standardized difference between effect estimates of <2, which suggests differences within expected random variation. Conclusions: Agreement between RCT and RWE findings varies depending on which agreement metric is used. 
Interim findings indicate that selection of active comparator therapies with similar indications and use patterns enhances the validity of RWE. Even in the context of active comparators, concordance between RCT and RWE findings is not guaranteed, partially because trials are not emulated exactly. More trial emulations are needed to understand how often and in what contexts RWE findings match RCTs. Clinical Trial Registration: URL: https://clinicaltrials.gov Unique Identifiers: NCT03936049, NCT04215523, NCT04215536, NCT03936010, NCT03936036, NCT03936062, NCT03936023, NCT03648424, NCT04237935, NCT04237922
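The two agreement metrics mentioned in this abstract (the RWE estimate falling within the RCT's 95% CI, and a standardized difference between effect estimates below 2) can be illustrated with a short sketch. The abstract does not spell out its formula, so the code below assumes one common form: the difference in log hazard ratios divided by the square root of the summed variances, with standard errors recovered from the reported 95% CIs. All function names and numbers are hypothetical and are not taken from RCT-DUPLICATE.

```python
import math

def se_from_ci(lower, upper, z=1.96):
    """Approximate standard error of log(HR) from a 95% CI on the HR scale."""
    return (math.log(upper) - math.log(lower)) / (2 * z)

def standardized_difference(hr_rwe, ci_rwe, hr_rct, ci_rct):
    """Assumed form: (log HR_RWE - log HR_RCT) / sqrt(SE_RWE^2 + SE_RCT^2)."""
    se_rwe = se_from_ci(*ci_rwe)
    se_rct = se_from_ci(*ci_rct)
    diff = math.log(hr_rwe) - math.log(hr_rct)
    return diff / math.sqrt(se_rwe ** 2 + se_rct ** 2)

def estimate_agreement(hr_rwe, ci_rct):
    """True when the RWE point estimate lies inside the RCT 95% CI."""
    return ci_rct[0] <= hr_rwe <= ci_rct[1]

# Hypothetical numbers for illustration only.
z_diff = standardized_difference(0.82, (0.76, 0.88), 0.87, (0.78, 0.97))
print(round(z_diff, 2), estimate_agreement(0.82, (0.78, 0.97)))
```

Under this assumed definition, an absolute value below 2 is roughly what one would expect from random variation alone, which appears to be the reading the abstract intends.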
Article
Objectives: Real-world evidence (RWE) has gained increased attention in recent years as a complement to traditional clinical trials. The use of RWE to establish the efficacy of oncology drugs for Food and Drug Administration (FDA) approval has not been described. In this paper, we review 5 recent examples where RWE was submitted in support of the FDA approvals of original or supplementary indications for oncology drugs. Methods: To identify cases where RWE was used, we reviewed drug approval packages available at Drugs@FDA for oncology drugs approved between 2017 and 2019. Five cases were selected to present a broad overview of different types of RWE, different circumstances under which RWE has been used for regulatory approvals, and how FDA evaluated the data in each case. The type of RWE submitted, the indication, limitations identified by FDA reviewers, and the outcome of the submission are discussed. Results: RWE, particularly historical controls for rare or orphan indications, has been used to support both original and supplementary oncology drug approvals. Types of RWE included data from electronic health records, claims, post-marketing safety reports, retrospective medical record reviews, and expanded access studies. Small sample sizes, data quality, and methodological issues were among concerns cited by FDA reviewers. Conclusion: By bridging the gap between the constraints of the trial setting and the realities of clinical practice, RWE can add value to a regulatory submission. These early examples provide insight into how regulators evaluated RWE submitted as evidence of efficacy for oncology drugs.
Article
There is growing interest globally in using real-world data (RWD) and real-world evidence (RWE) for health technology assessment (HTA). Optimal collection, analysis, and use of RWD/RWE to inform HTA requires a conceptual framework to standardize processes and ensure consistency. However, such a framework is currently lacking in Asia, a region that is likely to benefit from RWD/RWE for at least two reasons. First, there is often limited Asian representation in clinical trials unless they are specifically conducted in Asian populations, and RWD may help to fill the evidence gap. Second, in a few Asian health systems, reimbursement decisions are not made at market entry, thus allowing RWD/RWE to be collected to give more certainty about the effectiveness of technologies in the local setting and to inform their appropriate use. Furthermore, an alignment of RWD/RWE policies across Asia would equip decision makers with context-relevant evidence and improve timely patient access to new technologies. Using data collected from eleven health systems in Asia, this paper provides a review of the current landscape of RWD/RWE in Asia to inform HTA and explores a way forward to align policies within the region. This paper concludes with a proposal to establish an international collaboration among academics and HTA agencies in the region: the REAL World Data In ASia for HEalth Technology Assessment in Reimbursement (REALISE) working group, which seeks to develop a non-binding guidance document on the use of RWD/RWE to inform HTA for decision making in Asia.
Article
Purpose For this article, the authors compiled, summarized, and analyzed data from 27 cases in which real-world data (RWD) were applied in regulatory approval. The aims were to provide an overview of RWD, based on classifications per therapeutic area, age group, drivers of acceptability, utility, data sources, and timelines, and to present insights on how it has been applied in regulatory decision making to date. Methods Clarivate Analytics was commissioned to collect data from cases in which RWD was used for new drug applications and line extensions submitted to the European Medicines Agency (EMA), the US Food and Drug Administration (FDA), Health Canada, and Japan's Pharmaceuticals and Medical Devices Agency. The query resulted in 27 cases in which regulatory approval was associated with RWD. The data were then categorized and elaborated with supporting information gathered from public databases and company websites. Findings There were 17 identified cases in which RWD were used for new drug applications, and 10 for line extensions, between the years 1998 and 2019. Approvals were spread across regulatory bodies: the EMA alone (6 cases), the FDA alone (4 cases), or jointly between the EMA and FDA or other regulatory bodies. The applications were also distributed across age groups and therapeutic areas but were mostly applied in oncology and metabolism. The new drug applications of all 17 products were approved, with drugs from new drug applications initially marketed as orphan drugs. In most cases, RWD were used either as primary data, when noncomparative data were available to demonstrate tolerability and efficacy, or as supportive data when validating findings. Common sources of RWD have been health or medical records (16 cases) and registries (8 cases). Review timelines in which RWD were applied were than 1 year for new drug applications and between 3 and 10 months for line extensions. 
Implications The analysis of this study was limited in that the data were gathered from the commissioned query and may therefore have been nonexhaustive. Nonetheless, we recognize that the use of RWD has been gaining attention across the community and is expected to expand as a result of the various initiatives and efforts carried out in the sector. While the current application of RWD has been limited to specific cases, there is a potential to further explore and develop its application. Further refinements in the analytical processes, methodologies, and techniques would need to be established to achieve similar effects observed in randomized controlled trials (Clin Ther. 2020; 42:XXX–XXX) © 2020 Elsevier Inc.
Article
Randomized clinical trials (RCTs) are the gold standard in producing clinical evidence of efficacy and safety of medical interventions. More recently, a new paradigm is emerging—specifically within the context of preauthorization regulatory decision‐making—for some novel uses of real‐world evidence (RWE) from a variety of real‐world data (RWD) sources to answer certain clinical questions. Traditionally reserved for rare diseases and other special circumstances, external controls (eg, historical controls) are recognized as a possible type of control arm for single‐arm trials. However, creating and analyzing an external control arm using RWD can be challenging since design and analytics may not fully control for all systematic differences (biases). Nonetheless, certain biases can be attenuated using appropriate design and analytical approaches. The main objective of this paper is to improve the scientific rigor in the generation of external control arms using RWD. Here we (a) discuss the rationale and regulatory circumstances appropriate for external control arms, (b) define different types of external control arms, and (c) describe study design elements and approaches to mitigate certain biases in external control arms. This manuscript received endorsement from the International Society for Pharmacoepidemiology (ISPE).
Article
Purpose: There is a need to develop hybrid trial methodology combining the best parts of traditional randomized controlled trials (RCTs) and observational study designs to produce real-world evidence (RWE) that provides adequate scientific evidence for regulatory decision-making. Methods: This review explores how hybrid study designs that include features of RCTs and studies with real-world data (RWD) can combine the advantages of both to generate RWE that is fit for regulatory purposes. Results: Some hybrid designs include randomization and use pragmatic outcomes; other designs use single-arm trial data supplemented with external comparators derived from RWD or leverage novel data collection approaches to capture long-term outcomes in a real-world setting. Some of these approaches have already been successfully used in regulatory decisions, raising the possibility that studies using RWD could increasingly be used to augment or replace traditional RCTs for the demonstration of drug effectiveness in certain contexts. These changes come against a background of long reliance on RCTs for regulatory decision-making, which are labor-intensive, costly, and produce data that can have limited applicability in real-world clinical practice. Conclusions: While RWE from observational studies is well accepted for satisfying postapproval safety monitoring requirements, it has not commonly been used to demonstrate drug effectiveness for regulatory purposes. However, this position is changing as regulatory opinions, guidance frameworks, and RWD methodologies are evolving, with growing recognition of the value of using RWE that is acceptable for regulatory decision-making.
Article
Recent legislation mandates that the US FDA issue guidance regarding when real‐world evidence (RWE) could be used to support regulatory decision‐making. Although RWE could come from randomized or nonrandomized designs, there are significant concerns about the validity of RWE assessing medication effectiveness based on nonrandomized designs. We propose an initiative using healthcare claims data to assess the ability of nonrandomized RWE to provide results that are comparable to those from randomized controlled trials (RCTs). We selected 40 RCTs, and we estimate that approximately 30 attempted replications will be completed after feasibility analyses. We designed an implementation process to ensure that each attempted replication is consistent, transparent, and reproducible. This initiative is the first to systematically evaluate the ability of nonrandomized RWE to replicate multiple RCTs using a structured process. Results from this study should provide insight on the strengths and limitations of using nonrandomized RWE from claims for regulatory decision‐making.
Article
Purpose: This pilot study examined the ability to operationalize the collection of real-world data to explore the potential use of real-world end points extracted from data from diverse health care data organizations and to assess how these relate to similar end points in clinical trials for immunotherapy-treated advanced non-small-cell lung cancer. Patients and methods: Researchers from six organizations followed a common protocol using data from administrative claims and electronic health records to assess real-world end points, including overall survival (rwOS), time to next treatment, time to treatment discontinuation (rwTTD), time to progression, and progression-free survival, among patients with advanced non-small-cell lung cancer treated with programmed death 1/programmed death-ligand 1 inhibitors in real-world settings. Data sets included from 269 to 6,924 patients who were treated between January 2011 and October 2017. Results from contributors were anonymized. Results: Correlations between real-world intermediate end points (rwTTD and time to next treatment) and rwOS were moderate to high (range, 0.6 to 0.9). rwTTD was the most consistent end point, as treatment detail was available in all data sets. rwOS at 1 year post-programmed death-ligand 1 initiation ranged from 40% to 57%. In addition, rwOS as assessed via electronic health records and claims data fell within the range of median OS values observed in relevant clinical trials. Data sources had been used extensively for research with ongoing data curation to assure accuracy and practical completeness before the initiation of this research. Conclusion: These findings demonstrate that real-world end points are generally consistent with each other and with outcomes observed in randomized clinical trials, which substantiates the potential validity of real-world data to support regulatory and payer decision making. 
Differences observed likely reflect true differences between real-world and protocol-driven practices.
Article
Background: Non-alcoholic fatty liver disease (NAFLD) is the most common cause of liver disease worldwide. It affects an estimated 20% of the general population, based on cohort studies of varying size and heterogeneous selection. However, the prevalence and incidence of recorded NAFLD diagnoses in unselected real-world health-care records is unknown. We harmonised health records from four major European territories and assessed age- and sex-specific point prevalence and incidence of NAFLD over the past decade. Methods: Data were extracted from The Health Improvement Network (UK), Health Search Database (Italy), Information System for Research in Primary Care (Spain) and Integrated Primary Care Information (Netherlands). Each database uses a different coding system. Prevalence and incidence estimates were pooled across databases by random-effects meta-analysis after a log-transformation. Results: Data were available for 17,669,973 adults, of which 176,114 had a recorded diagnosis of NAFLD. Pooled prevalence trebled from 0.60% in 2007 (95% confidence interval: 0.41-0.79) to 1.85% (0.91-2.79) in 2014. Incidence doubled from 1.32 (0.83-1.82) to 2.35 (1.29-3.40) per 1000 person-years. The FIB-4 non-invasive estimate of liver fibrosis could be calculated in 40.6% of patients, of whom 29.6-35.7% had indeterminate or high-risk scores. Conclusions: In the largest primary-care record study of its kind to date, rates of recorded NAFLD are much lower than expected suggesting under-diagnosis and under-recording. Despite this, we have identified rising incidence and prevalence of the diagnosis. Improved recognition of NAFLD may identify people who will benefit from risk factor modification or emerging therapies to prevent progression to cardiometabolic and hepatic complications.
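The pooling step described in the Methods (random-effects meta-analysis after a log-transformation) can be sketched as follows. The abstract does not name the between-database variance estimator, so this sketch assumes the DerSimonian-Laird estimator and a delta-method variance for the log prevalence. The function name and the counts are hypothetical and are not the study's data.

```python
import math

def pool_log_prevalence(events, totals, z=1.96):
    """Pool prevalences on the log scale with a DerSimonian-Laird random-effects model.

    Returns (pooled_prevalence, ci_lower, ci_upper) on the original scale.
    """
    # Log prevalence per database and its delta-method variance:
    # Var(log p) ~= (1 - p) / (n * p) = (1 - p) / events.
    y = [math.log(e / n) for e, n in zip(events, totals)]
    v = [(1 - e / n) / e for e, n in zip(events, totals)]

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1 / vi for vi in v]
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))

    # DerSimonian-Laird between-database variance, truncated at zero.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(y) - 1)) / c)

    # Random-effects weights, pooled log prevalence, and its standard error.
    w_re = [1 / (vi + tau2) for vi in v]
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return math.exp(mu), math.exp(mu - z * se), math.exp(mu + z * se)

# Hypothetical counts for four databases (illustration only).
pooled, lower, upper = pool_log_prevalence(
    events=[60, 185, 120, 90], totals=[10000, 10000, 10000, 10000]
)
print(f"{pooled:.4f} ({lower:.4f} to {upper:.4f})")
```

Pooling on the log scale keeps the back-transformed interval above zero, which is one practical reason to transform proportions this small before combining them.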
Article
Background: Reimbursement decisions are conventionally based on evidence from randomised controlled trials (RCTs), which often have high internal validity but low external validity. Real-world data (RWD) may provide complementary evidence for relative effectiveness assessments (REAs) and cost-effectiveness assessments (CEAs). This study examines whether RWD is incorporated in health technology assessment (HTA) of melanoma drugs by European HTA agencies, as well as differences in RWD use between agencies and across time. Methods: HTA reports published between 1 January 2011 and 31 December 2016 were retrieved from websites of agencies representing five jurisdictions: England [National Institute for Health and Care Excellence (NICE)], Scotland [Scottish Medicines Consortium (SMC)], France [Haute Autorité de santé (HAS)], Germany [Institute for Quality and Efficacy in Healthcare (IQWiG)] and The Netherlands [Zorginstituut Nederland (ZIN)]. A standardized data extraction form was used to extract information on RWD inclusion for both REAs and CEAs. Results: Overall, 52 reports were retrieved, all of which contained REAs; CEAs were present in 25 of the reports. RWD was included in 28 of the 52 REAs (54%), mainly to estimate melanoma prevalence, and in 22 of the 25 (88%) CEAs, mainly to extrapolate long-term effectiveness and/or identify drug-related costs. Differences emerged between agencies regarding RWD use in REAs; the ZIN and IQWiG cited RWD for evidence on prevalence, whereas the NICE, SMC and HAS additionally cited RWD use for drug effectiveness. No visible trend for RWD use in REAs and CEAs over time was observed. Conclusion: In general, RWD inclusion was higher in CEAs than REAs, and was mostly used to estimate melanoma prevalence in REAs or to predict long-term effectiveness in CEAs. Differences emerged between agencies' use of RWD; however, no visible trends for RWD use over time were observed.
Article
Purpose: Real-world evidence (RWE) includes data from retrospective or prospective observational studies and observational registries and provides insights beyond those addressed by randomized controlled trials. RWE studies aim to improve health care decision making. Methods: The International Society for Pharmacoeconomics and Outcomes Research (ISPOR) and the International Society for Pharmacoepidemiology (ISPE) created a task force to make recommendations regarding good procedural practices that would enhance decision makers' confidence in evidence derived from RWD studies. Peer review by ISPOR/ISPE members and task force participants provided a consensus-building iterative process for the topics and framing of recommendations. Results: The ISPOR/ISPE Task Force recommendations cover seven topics such as study registration, replicability, and stakeholder involvement in RWE studies. These recommendations, in concert with earlier recommendations about study methodology, provide a trustworthy foundation for the expanded use of RWE in health care decision making. Conclusion: The focus of these recommendations is good procedural practices for studies that test a specific hypothesis in a specific population. We recognize that some of the recommendations in this report may not be widely adopted without appropriate incentives from decision makers, journal editors, and other key stakeholders.
Article
Purpose: Defining a study population and creating an analytic dataset from longitudinal healthcare databases involves many decisions. Our objective was to catalogue scientific decisions underpinning study execution that should be reported to facilitate replication and enable assessment of validity of studies conducted in large healthcare databases. Methods: We reviewed key investigator decisions required to operate a sample of macros and software tools designed to create and analyze analytic cohorts from longitudinal streams of healthcare data. A panel of academic, regulatory, and industry experts in healthcare database analytics discussed and added to this list. Conclusion: Evidence generated from large healthcare encounter and reimbursement databases is increasingly being sought by decision-makers. Varied terminology is used around the world for the same concepts. Agreeing on terminology and which parameters from a large catalogue are the most essential to report for replicable research would improve transparency and facilitate assessment of validity. At a minimum, reporting for a database study should provide clarity regarding operational definitions for key temporal anchors and their relation to each other when creating the analytic dataset, accompanied by an attrition table and a design diagram. A substantial improvement in reproducibility, rigor and confidence in real world evidence generated from healthcare databases could be achieved with greater transparency about operational study parameters used to create analytic datasets from longitudinal healthcare databases.
Article
Regulators consider randomized controlled trials (RCTs) as the gold standard for evaluating the safety and effectiveness of medications, but their costs, duration, and limited generalizability have caused some to look for alternatives. Real world evidence based on data collected outside of RCTs, such as registries and longitudinal healthcare databases, can sometimes substitute for RCTs, but concerns about validity have limited their impact. Greater reliance on such real world data (RWD) in regulatory decision-making requires understanding why some studies fail while others succeed in producing results similar to RCTs. Key questions when considering whether RWD analyses can substitute for RCTs for regulatory decision-making are WHEN one can study drug effects without randomization and HOW to implement a valid RWD analysis if one has decided to pursue that option. The WHEN is primarily driven by externalities not controlled by investigators, while the HOW is focused on avoiding known mistakes in RWD analyses. This article is protected by copyright. All rights reserved.
Article
Coronary artery disease (CAD) is the underlying cause of death in one out of seven deaths in the USA. Aspirin therapy has been proven to decrease mortality and major adverse cardiovascular events in patients with CAD. Despite a plethora of studies showing the benefit of aspirin in secondary prevention of cardiovascular events, debate remains regarding the optimal dose due to relatively small studies that had disparate results when comparing patients taking different aspirin dosages. More recently, aspirin dosing has been thoroughly studied in the CAD population with concomitant therapy (such as P2Y12 inhibitors); however, patients in these studies were not randomized to aspirin dose. No randomized controlled trial has directly measured aspirin dosages in a population of patients with established coronary artery disease. In 2015, the Patient-Centered Outcomes Research Institute (PCORI) developed a network, called PCORnet, that includes patient-powered research networks (PPRN) and clinical data research networks (CDRN). The main objective of PCORnet is to conduct widely generalizable observational studies and clinical trials (including large, pragmatic clinical trials) at a low cost. The first clinical trial, called Aspirin Dosing: A Patient-centric Trial Assessing Benefits and Long-term Effectiveness (ADAPTABLE), will randomly assign 20,000 subjects with established coronary heart disease to either low dose (81 mg) or high dose (325 mg) and should be able to finally answer which dosage of aspirin is best for patients with established cardiovascular disease.
Article
PRECIS is a tool to help trialists make design decisions consistent with the intended purpose of their trial. This paper gives guidance on how to use an improved, validated version, PRECIS-2, which has been developed with the help of over 80 international trialists, clinicians, and policymakers. Keeping the original simple wheel format, PRECIS-2 has nine domains (eligibility criteria, recruitment, setting, organisation, flexibility (delivery), flexibility (adherence), follow-up, primary outcome, and primary analysis), each scored from 1 (very explanatory) to 5 (very pragmatic) to facilitate domain discussion and consensus. It is hoped PRECIS-2 will be valuable in supporting the explicit matching of design decisions to how the trial results are intended to be used.
Article
Randomized controlled trials have traditionally been the gold standard against which all other sources of clinical evidence are measured. However, the cost of conducting these trials can be prohibitive. In addition, evidence from the trials frequently rests on narrow patient-inclusion criteria and thus may not generalize well to real clinical situations. Given the increasing availability of comprehensive clinical data in electronic health records (EHRs), some health system leaders are now advocating for a shift away from traditional trials and toward large-scale retrospective studies, which can use practice-based evidence that is generated as a by-product of clinical processes. Other thought leaders in clinical research suggest that EHRs should be used to lower the cost of trials by integrating point-of-care randomization and data capture into clinical processes. We believe that a successful learning health care system will require both approaches, and we suggest a model that resolves this escalating tension: a "green button" function within EHRs to help clinicians leverage aggregate patient data for decision making at the point of care. Giving clinicians such a tool would support patient care decisions in the absence of gold-standard evidence and would help prioritize clinical questions for which EHR-enabled randomization should be carried out. The privacy rule in the Health Insurance Portability and Accountability Act (HIPAA) of 1996 may require revision to support this novel use of patient data.
Article
The propensity score is the probability of treatment assignment conditional on observed baseline characteristics. The propensity score allows one to design and analyze an observational (nonrandomized) study so that it mimics some of the particular characteristics of a randomized controlled trial. In particular, the propensity score is a balancing score: conditional on the propensity score, the distribution of observed baseline covariates will be similar between treated and untreated subjects. I describe 4 different propensity score methods: matching on the propensity score, stratification on the propensity score, inverse probability of treatment weighting using the propensity score, and covariate adjustment using the propensity score. I describe balance diagnostics for examining whether the propensity score model has been adequately specified. Furthermore, I discuss differences between regression-based methods and propensity score-based methods for the analysis of observational data. I describe different causal average treatment effects and their relationship with propensity score analyses.
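As a rough illustration of inverse probability of treatment weighting and a balance diagnostic, the sketch below computes the standardized mean difference of one binary covariate before and after weighting. The toy cohort, the covariate `x`, and the propensity scores in `ps` are hypothetical values chosen so that the scores are exactly right; in practice the scores would be estimated from a fitted model, and this is not the article's own code.

```python
from math import sqrt


def weighted_mean(values, weights):
    """Weighted mean of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def weighted_var(values, weights):
    """Weighted (population-style) variance of values."""
    m = weighted_mean(values, weights)
    return sum(w * (v - m) ** 2 for v, w in zip(values, weights)) / sum(weights)


def iptw_weights(treated, ps):
    """IPTW weights: 1/e for treated subjects, 1/(1 - e) for untreated subjects."""
    return [1.0 / e if t == 1 else 1.0 / (1.0 - e) for t, e in zip(treated, ps)]


def smd(x, treated, weights):
    """Standardized mean difference of covariate x between treatment groups."""
    xt = [v for v, t in zip(x, treated) if t == 1]
    wt = [w for w, t in zip(weights, treated) if t == 1]
    xc = [v for v, t in zip(x, treated) if t == 0]
    wc = [w for w, t in zip(weights, treated) if t == 0]
    pooled_sd = sqrt((weighted_var(xt, wt) + weighted_var(xc, wc)) / 2.0)
    return (weighted_mean(xt, wt) - weighted_mean(xc, wc)) / pooled_sd


# Hypothetical cohort: 100 subjects with x = 0 (20 treated, so e = 0.2)
# and 100 subjects with x = 1 (60 treated, so e = 0.6).
x = [0] * 100 + [1] * 100
treated = [1] * 20 + [0] * 80 + [1] * 60 + [0] * 40
ps = [0.2] * 100 + [0.6] * 100

smd_raw = smd(x, treated, [1.0] * len(x))              # unweighted imbalance
smd_iptw = smd(x, treated, iptw_weights(treated, ps))  # imbalance after IPTW

print(f"SMD before weighting: {smd_raw:.3f}")
print(f"SMD after IPTW:       {smd_iptw:.3f}")
```

Because the assumed scores match the true treatment probabilities in each stratum, the weighted difference is essentially zero. With estimated scores, a residual imbalance would suggest that the propensity score model needs to be respecified.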
Article
For obtaining causal inferences that are objective, and therefore have the best chance of revealing scientific truths, carefully designed and executed randomized experiments are generally considered to be the gold standard. Observational studies, in contrast, are generally fraught with problems that compromise any claim for objectivity of the resulting causal inferences. The thesis here is that observational studies have to be carefully designed to approximate randomized experiments, in particular, without examining any final outcome data. Often a candidate data set will have to be rejected as inadequate because of lack of data on key covariates, or because of lack of overlap in the distributions of key covariates between treatment and control groups, often revealed by careful propensity score analyses. Sometimes the template for the approximating randomized experiment will have to be altered, and the use of principal stratification can be helpful in doing this. These issues are discussed and illustrated using the framework of potential outcomes to define causal effects, which greatly clarifies critical issues.
Article
Real-world evidence (RWE), derived from data from “real-world” clinical practice and medical product utilization, is an increasingly important source of evidence that holds great potential to increase efficiency and improve clinical development and life cycle management of medical products. Regulatory agencies, public-private partnerships and health technology assessment organizations have launched major initiatives and released guidance to address considerations in the use of RWE to inform regulatory decision making. However, many challenges remain on how RWE could be best utilized for various types of regulatory decisions from statistical perspectives. To address the relevant statistical challenges, a working group under the auspices of the ASA Biopharmaceutical Section was established. This article reviews the biostatistical challenges and methods for the use of RWE for medical product development. There are two companion papers as the output from the same working group that address focused topics. The paper by Chen et al. (2020) provides the current landscape on the use of RWE to inform clinical study design and analysis, and the paper by Ho et al. (2020) presents a review of causal inference framework for design and analysis of studies using RWE.
Article
Real-World Data (RWD), such as electronic health records (EHRs), reimbursement requests as adjudicated by health insurance companies, and health survey data as collected by government agencies or other research organizations, are increasingly used in drug development. Regulatory agencies, public-private partnerships and professional organizations have initiated major programs and released guidance or guidelines to address challenges in the use of real-world evidence (RWE) to inform regulatory decision making. Since statistical research on the design, analysis, and interpretation of RWE studies is still a nascent area, the ASA Biopharmaceutical Section established a scientific working group to survey the existing practices and identify opportunities in the area. This article reviews the current landscape of using external data such as RWD/RWE to inform clinical study design and analysis. Two companion articles review the landscape of using RWE in medical product label expansion and of causal inference frameworks to generate RWE.
Article
Confounding adjustment plays a key role in designing observational studies such as cross‐sectional studies, case‐control studies, and cohort studies. In this article, we propose a simple method for sample size calculation in observational research in the presence of confounding. The method is motivated by the notion of E‐value, using some bounding factor to quantify the impact of confounders on the effect size. The method can be applied to calculate the needed sample size in observational research when the outcome variable is binary, continuous, or time‐to‐event. The method can be implemented straightforwardly using existing commercial software such as the PASS software. We demonstrate the performance of the proposed method through numerical examples, simulation studies, and a real application, which show that the proposed method is conservative in providing a slightly bigger sample size than what it needs to achieve a given power.
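The abstract does not spell out its formulas, so the sketch below shows only the two standard quantities it refers to: the E-value for an observed risk ratio, and the joint bounding factor from the sensitivity analysis literature that the E-value is built on. The function names are illustrative, and this is not the authors' sample size procedure.

```python
import math


def e_value(rr: float) -> float:
    """E-value for an observed risk ratio: RR + sqrt(RR * (RR - 1)).

    For a protective estimate (RR < 1) the reciprocal is taken first.
    """
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    if rr < 1:
        rr = 1.0 / rr
    return rr + math.sqrt(rr * (rr - 1.0))


def bounding_factor(rr_eu: float, rr_ud: float) -> float:
    """Joint bounding factor RR_EU * RR_UD / (RR_EU + RR_UD - 1).

    rr_eu: exposure-confounder association; rr_ud: confounder-outcome association.
    """
    return rr_eu * rr_ud / (rr_eu + rr_ud - 1.0)


observed_rr = 2.0
ev = e_value(observed_rr)
print(f"E-value for RR = {observed_rr}: {ev:.3f}")

# When both confounder associations equal the E-value, the bounding factor
# equals the observed risk ratio, i.e. confounding could fully explain it.
print(f"Bounding factor at the E-value: {bounding_factor(ev, ev):.3f}")
```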
Article
Randomized controlled clinical trials (RCTs) are the gold standard for evaluating the safety and efficacy of pharmaceutical drugs, but in many cases their costs, duration, limited generalizability, and ethical or technical feasibility have caused some to look for real-world studies as alternatives. However, real-world studies may be less convincing due to the lack of randomization and blinding. In this article, we discuss some key considerations in the design of real-world studies, which include experimental studies (e.g., hybrid or pragmatic clinical trials and non-randomized single-arm clinical trials with external controls) and non-experimental studies (e.g., cohort studies, cross-sectional studies, and case-control studies). Causal inference plays a critical role in the derivation of robust real-world evidence (RWE) from the analysis of real-world data (RWD). Therefore, we apply the hypothetical strategy, along with the concept of potential outcome, to lay out these key considerations, and we hope these considerations are helpful for the design, conduct, and analysis of real-world studies.
Article
Randomized controlled clinical trials are the gold standard for evaluating the safety and efficacy of pharmaceutical drugs, but in many cases their costs, duration, limited generalizability, and ethical or technical feasibility have caused some to look for real-world studies as alternatives. On the other hand, real-world data may be much less convincing due to the lack of randomization and the presence of confounding bias. In this article, we propose a statistical roadmap to translate real-world data (RWD) to robust real-world evidence (RWE). The Food and Drug Administration (FDA) is working on guidelines, with a target to release a draft by 2021, to harmonize RWD applications and monitor the safety and effectiveness of pharmaceutical drugs using RWE. The proposed roadmap aligns with the newly released framework for FDA’s RWE Program in December 2018 and we hope this statistical roadmap is useful for statisticians who are eager to embark on their journeys in the real-world research.
Article
The US Food and Drug Administration (FDA) has shown scientific discretion in interpreting the substantial evidence requirement for the approval of new drugs with its considerations on the use of single controlled or uncontrolled trials (Federal Food, Drug, and Cosmetic Act § 505(d), 21 USC 355(d), 1962). With the passage of the 21st Century Cures Act (21st Century Cures-patients. House, Energy and Commerce Committee, Washington, DC, 2019 available at: https://energycommerce.house.gov/sites/republicans.energycommerce.house.gov/files/analysis/21stCenturyCures/20140516PatientsWhitePaper.pdf), the FDA is mandated to expand the role of real-world evidence (RWE) in support of drug approval. This mandate further broadens the scope of scientific discretion to include data collected outside clinical trials. We summarize the agency’s past acceptance of real-world data (RWD) sources for supporting drug approval in new indications which have been reflected in US labels. In our summary, we focus on the type of RWD and statistical methodologies presented in these labels. Furthermore, two labels were selected for in-depth assessment of the RWE they present. Through these examples, we demonstrate the issues that can be raised in data collection that could affect interpretation. In addition, a brief discussion of statistical methods that can be used to incorporate RWE into clinical development is presented.
Article
Pragmatic clinical trials often entail the use of electronic health record (EHR) and claims data, but bias and quality issues associated with these data can limit their fitness for research purposes particularly for study end points. Patient-reported health (PRH) data can be used to confirm or supplement EHR and claims data in pragmatic trials, but these data can bring their own biases. Moreover, PRH data can complicate analyses if they are discordant with other sources. Using experience in the design and conduct of multi-site pragmatic trials, we itemize the strengths and limitations of PRH data and identify situational criteria for determining when PRH data are appropriate or ideal to fill gaps in the evidence collected from EHRs. To provide guidance for the scientific rationale and appropriate use of patient-reported data in pragmatic clinical trials, we describe approaches for ascertaining and classifying study end points and addressing issues of incomplete data, data alignment, and concordance. We conclude by identifying areas that require more research.
Article
Open data sharing and access has the potential to promote transparency and reproducibility in research, contribute to education and training, and prompt innovative secondary research. Yet, there are many reasons why researchers don’t share their data. These include, among others, time and resource constraints, patient data privacy issues, lack of access to appropriate funding, insufficient recognition of the data originators’ contribution, and the concern that commercial or academic competitors may benefit from analyses based on shared data. Nevertheless, there is a positive interest within and across the research and patient communities to create shared data resources. In this perspective, we will try to highlight the spectrum of “openness” and “data access” that exists at present, discuss the strengths and weaknesses of current data access platforms, present current examples of data sharing platforms, and propose guidelines to revise current data sharing practices going forward.
Article
Purpose: Observational pharmacoepidemiological studies can provide valuable information on the effectiveness or safety of interventions in the real world, but one major challenge is the existence of unmeasured confounder(s). While many analytical methods have been developed for dealing with this challenge, they appear under-utilized, perhaps due to the complexity and varied requirements for implementation. Thus, there is an unmet need to improve understanding of the appropriate course of action to address unmeasured confounding under a variety of research scenarios. Methods: We implemented a stepwise search strategy to find articles discussing the assessment of unmeasured confounding in electronic literature databases. Identified publications were reviewed and characterized by the applicable research settings and the information required for implementing each method. We further used this information to develop a best practice recommendation to help guide the selection of appropriate analytical methods for assessing the potential impact of unmeasured confounding. Results: Over 100 papers were reviewed, and 15 methods were identified. We used a flowchart to illustrate the best practice recommendation which was driven by 2 critical components: (1) availability of information on the unmeasured confounders; and (2) goals of the unmeasured confounding assessment. Key factors for implementation of each method were summarized in a checklist to provide further assistance to researchers for implementing these methods. Conclusion: When assessing comparative effectiveness or safety in observational research, the impact of unmeasured confounding should not be ignored. Instead, we suggest quantitatively evaluating the impact of unmeasured confounding and provide a best practice recommendation for selecting appropriate analytical methods.
Article
The FDA Reauthorization Act of 2017 includes the sixth version of the Prescription Drug User Fee Act. User fees have accelerated drug approvals in the context of inadequate funding of the FDA, and industry now pays 75% of the costs of the scientific review of drugs.
Article
Background: Randomized controlled trials provide robust data on the efficacy of interventions rather than on effectiveness. Health technology assessment (HTA) agencies worldwide are thus exploring whether real-world data (RWD) may provide alternative sources of data on effectiveness of interventions. Presently, an overview of HTA agencies' policies for RWD use in relative effectiveness assessments (REA) is lacking. Objectives: To review policies of six European HTA agencies on RWD use in REA of drugs. A literature review and stakeholder interviews were conducted to collect information on RWD policies for six agencies: the Dental and Pharmaceutical Benefits Agency (Sweden), the National Institute for Health and Care Excellence (United Kingdom), the Institute for Quality and Efficiency in Healthcare (Germany), the High Authority for Health (France), the Italian Medicines Agency (Italy), and the National Healthcare Institute (The Netherlands). The following contexts for RWD use in REA of drugs were reviewed: initial reimbursement discussions, pharmacoeconomic analyses, and conditional reimbursement schemes. We identified 13 policy documents and 9 academic publications, and conducted 6 interviews. Results: Policies for RWD use in REA of drugs notably differed across contexts. Moreover, policies differed between HTA agencies. Such variations might discourage the use of RWD for HTA. Conclusions: To facilitate the use of RWD for HTA across Europe, more alignment of policies seems necessary. Recent articles and project proposals of the European network of HTA may provide a starting point to achieve this.
Article
Objective: Objective and reproducible evaluation of data quality is of paramount importance for studies of 'real-world' observational data. Here, we summarise a standardised data quality, density and generalisability process implemented by MSBase, a global multiple sclerosis (MS) cohort study. Methods: Error rate, data density score and generalisability score were developed using all 35,869 patients enrolled in MSBase as of November 2015. The data density score was calculated across six domains (follow-up, demography, visits, MS relapses, paraclinical data and therapy) and emphasised data completeness. The error rate evaluated syntactic accuracy and consistency of data. The generalisability score evaluated believability of the demographic and treatment information. Correlations among the three scores and the number of patients per centre were evaluated. Results: Errors were identified at the median rate of 3 per 100 patient-years. The generalisability score indicated the samples' representativeness of the known MS epidemiology. Moderate correlation between the density and generalisability scores (ρ = 0.58) and a weak correlation between the error rate and the other two scores (ρ = -0.32 to -0.33) were observed. The generalisability score was strongly correlated with centre size (ρ = 0.79). Conclusion: The implemented scores enable objective evaluation of the quality of observational MS data, with an impact on the design of future analyses.
Article
Objective: Electronic health records (EHRs) are an increasingly common data source for clinical risk prediction, presenting both unique analytic opportunities and challenges. We sought to evaluate the current state of EHR based risk prediction modeling through a systematic review of clinical prediction studies using EHR data. Methods: We searched PubMed for articles that reported on the use of an EHR to develop a risk prediction model from 2009 to 2014. Articles were extracted by two reviewers, and we abstracted information on study design, use of EHR data, model building, and performance from each publication and supplementary documentation. Results: We identified 107 articles from 15 different countries. Studies were generally very large (median sample size = 26 100) and utilized a diverse array of predictors. Most used validation techniques (n = 94 of 107) and reported model coefficients for reproducibility (n = 83). However, studies did not fully leverage the breadth of EHR data, as they uncommonly used longitudinal information (n = 37) and employed relatively few predictor variables (median = 27 variables). Less than half of the studies were multicenter (n = 50) and only 26 performed validation across sites. Many studies did not fully address biases of EHR data such as missing data or loss to follow-up. Average c-statistics for different outcomes were: mortality (0.84), clinical prediction (0.83), hospitalization (0.71), and service utilization (0.71). Conclusions: EHR data present both opportunities and challenges for clinical risk prediction. There is room for improvement in designing such studies.
Chapter
Accurate enrollment information is critical for timely decision-making and execution for clinical trials. Enrollment must be carefully planned and monitored in order to maximize business benefit and to achieve study objectives. This is particularly true in adaptive design (AD) trials, where patient enrollment that is too slow or too fast, along with inaccurate enrollment prediction, will imperil the timing of the planned adaptations and/or invalidate them. This chapter will discuss the key considerations for patient enrollment management and present and discuss different patient recruitment models.
Article
Due to the special nature of medical device clinical studies, observational (non-randomized) comparative studies play important roles in the pre-market safety/effectiveness evaluation of medical devices. While historical data collected in earlier investigational device exemption studies of a previously approved medical device have been used to form control groups in comparative studies, high quality registry data are emerging to provide opportunities for the pre-market evaluation of new devices. However, in such studies, various biases could be introduced in every stage and aspect of study and may compromise the objectivity of study design and validity of study results. In this paper, challenges and opportunities in the design of such studies using propensity score methodology are discussed from regulatory perspectives.
Article
While there is growing demand for information about comparative effectiveness (CE), there is substantial debate about whether and when observational studies have sufficient quality to support decision making. To develop and test an item checklist that can be used to qualify those observational CE studies sufficiently rigorous in design and execution to contribute meaningfully to the evidence base for decision support. An 11-item checklist about data and methods (the GRACE checklist) was developed through literature review and consultation with experts from professional societies, payer groups, the private sector, and academia. Since no single gold standard exists for validation, checklist item responses were compared with 3 different types of external quality ratings (N=88 articles). The articles compared treatment effectiveness and/or safety of drugs, medical devices, and medical procedures. We validated checklist item responses 3 ways against external quality ratings, using published articles of observational CE or safety studies: (a) Systematic Review-quality assessment from a published systematic review; (b) Single Expert Review-quality assessment made according to the solicited "expert opinion" of a senior researcher; and (c) Concordant Expert Review-quality assessments from 2 experts for which there was concordance. Volunteers (N=113) from 5 continents completed 280 article assessments using the checklist. Positive and negative predictive values (PPV, NPV, respectively) of individual items were estimated to compare testers' assessments with those of experts. Taken as a whole, the scale had better NPV than PPV, for both data and methods. The most consistent predictor of quality relates to the validity of the primary outcomes measurement for the study purpose. Other consistent markers of quality relate to using concurrent comparators, minimizing the effects of bias by prudent choice of covariates, and using sensitivity analysis to test robustness of results. 
Concordance of expert opinion on the quality of the rated articles was 52%; most checklist items performed better. The 11-item GRACE checklist provides guidance to help determine which observational studies of CE have used strong scientific methods and good data that are fit for purpose and merit consideration for decision making. The checklist contains a parsimonious set of elements that can be objectively assessed in published studies, and user testing shows that it can be successfully applied to studies of drugs, medical devices, and clinical and surgical interventions. Although no scoring is provided, study reports that rate relatively well across checklist items merit in-depth examination to understand applicability, effect size, and likelihood of residual bias. The current testing and validation efforts did not achieve clear discrimination between studies fit for purpose and those not, but we have identified a critical, though remediable, limitation in our approach. Not specifying a specific granular decision for evaluation, or not identifying a single study objective in reports that included more than one, left reviewers with too broad an assessment challenge. We believe that future efforts will be more successful if reviewers are asked to focus on a specific objective or question. Despite the challenges encountered in this testing, an agreed upon set of assessment elements, checklists, or score cards is critical for the maturation of this field. Substantial resources will be expended on studies of real-world effectiveness, and if the rigor of these observational assessments cannot be assessed, then the impact of the studies will be suboptimal. Similarly, agreement on key elements of quality will ensure that budgets are appropriately directed toward those elements. 
Given the importance of this task and the lessons learned from these extensive efforts at validation and user testing, we are optimistic about the potential for improved assessments that can be used for diverse situations by people with a wide range of experience and training. Future testing would benefit by directing reviewers to address a single, granular research question, which would avoid problems that arose by using the checklist to evaluate multiple objectives, by using other types of validation test sets, and by employing further multivariate analysis to see if any combination or sequence of item responses has particularly high predictive validity.
Book
During the past decade there has been an explosion in computation and information technology. With it have come vast amounts of data in a variety of fields such as medicine, biology, finance, and marketing. The challenge of understanding these data has led to the development of new tools in the field of statistics, and spawned new areas such as data mining, machine learning, and bioinformatics. Many of these tools have common underpinnings but are often expressed with different terminology. This book describes the important ideas in these areas in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics. Many examples are given, with a liberal use of color graphics. It is a valuable resource for statisticians and anyone interested in data mining in science or industry. The book's coverage is broad, from supervised learning (prediction) to unsupervised learning. The many topics include neural networks, support vector machines, classification trees and boosting, the first comprehensive treatment of this topic in any book. This major new edition features many topics not covered in the original, including graphical models, random forests, ensemble methods, least angle regression and path algorithms for the lasso, non-negative matrix factorization, and spectral clustering. There is also a chapter on methods for “wide” data (p bigger than n), including multiple testing and false discovery rates. Trevor Hastie, Robert Tibshirani, and Jerome Friedman are professors of statistics at Stanford University. They are prominent researchers in this area: Hastie and Tibshirani developed generalized additive models and wrote a popular book of that title. Hastie co-developed much of the statistical modeling software and environment in R/S-PLUS and invented principal curves and surfaces. Tibshirani proposed the lasso and is co-author of the very successful An Introduction to the Bootstrap. Friedman is the co-inventor of many data-mining tools including CART, MARS, projection pursuit and gradient boosting.
Article
Although Berkson's bias is widely recognized in the epidemiologic literature, it remains underappreciated as a model of both selection bias and bias due to missing data. Simple causal diagrams and 2 × 2 tables illustrate how Berkson's bias connects to collider bias and selection bias more generally, and show the strong analogies between Berksonian selection bias and bias due to missing data. In some situations, considerations of whether data are missing at random or missing not at random are less important than the causal structure of the missing data process. Although dealing with missing data always relies on strong assumptions about unobserved variables, the intuitions built with simple examples can provide a better understanding of approaches to missing data in real-world situations.
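A small worked 2 × 2 example can make the selection mechanism concrete. In the sketch below, two conditions A and B are independent in the source population, each raises the chance of hospital admission, and the odds ratio is then computed among admitted patients only. The prevalences and admission probabilities are illustrative assumptions, not values from the article.

```python
def joint(p_a, p_b):
    """Cell probabilities for two independent binary conditions A and B."""
    return {
        (a, b): (p_a if a else 1 - p_a) * (p_b if b else 1 - p_b)
        for a in (0, 1)
        for b in (0, 1)
    }


def p_admit(a, b, base=0.01, due_a=0.30, due_b=0.20):
    """Probability of admission when each cause acts independently (assumed values)."""
    return 1 - (1 - base) * (1 - due_a) ** a * (1 - due_b) ** b


def select(cells, admit):
    """Unnormalised cell masses among the admitted (selection on a collider)."""
    return {k: v * admit(*k) for k, v in cells.items()}


def odds_ratio(cells):
    """Odds ratio of a 2 x 2 table; normalisation cancels out."""
    return (cells[(1, 1)] * cells[(0, 0)]) / (cells[(1, 0)] * cells[(0, 1)])


population = joint(p_a=0.10, p_b=0.05)
hospital = select(population, p_admit)

or_pop = odds_ratio(population)
or_hosp = odds_ratio(hospital)

print(f"Odds ratio in the source population: {or_pop:.3f}")
print(f"Odds ratio among admitted patients:  {or_hosp:.3f}")
```

Although A and B are unrelated in the population, conditioning on admission, a common effect of both, induces a strong inverse association between them. The same structure arises when analysis is restricted to records with complete data.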
Article
For estimating causal effects of treatments, randomized experiments are generally considered the gold standard. Nevertheless, they are often infeasible to conduct for a variety of reasons, such as ethical concerns, excessive expense, or timeliness. Consequently, much of our knowledge of causal effects must come from non-randomized observational studies. This article will advocate the position that observational studies can and should be designed to approximate randomized experiments as closely as possible. In particular, observational studies should be designed using only background information to create subgroups of similar treated and control units, where 'similar' here refers to their distributions of background variables. Of great importance, this activity should be conducted without any access to any outcome data, thereby assuring the objectivity of the design. In many situations, this objective creation of subgroups of similar treated and control units, which are balanced with respect to covariates, can be accomplished using propensity score methods. The theoretical perspective underlying this position will be presented followed by a particular application in the context of the US tobacco litigation. This application uses propensity score methods to create subgroups of treated units (male current smokers) and control units (male never smokers) who are at least as similar with respect to their distributions of observed background characteristics as if they had been randomized. The collection of these subgroups then 'approximates' a randomized block experiment with respect to the observed covariates.
Characterizing RWD Quality and Relevancy for Regulatory Purposes
  • Duke Margolis
Determining Real-world Data’s Fitness for Use and the Role of Reliability. Duke-Margolis Center for Health Policy
  • Duke Margolis
A Framework for Regulatory Use of Real World Evidence
  • Duke Margolis
Understanding the Need for Non-interventional Studies Using Secondary Data to Generate Real-world Evidence for Regulatory Decision Making, and Demonstrating Their Credibility
  • Duke Margolis
A Roadmap for Developing Study Endpoints in Real-world Settings
  • Duke Margolis
The Current Landscape in Causal Inference Frameworks for Design and Analysis of Studies Using Real-world Data and Evidence
  • M Ho
  • M Van Der Laan
  • H Lee
  • J Chen
  • K Kee
  • Y Fang
  • W He
  • T Irony
  • Q Jiang
  • X Lin
  • Z Meng
  • P Mishra-Kalyani
  • F Rockhold
  • Y Song
  • H Wang
  • R White
21st Century Cures Act. H.R. 34, 114th Congress
  • US Congress
Design and Analytic Considerations for Using Patient-reported Health Data in Pragmatic Clinical Trials
  • F W Rockhold
  • J D Tenenbaum
  • R Richesson
  • K A Marsolo
  • E C O'Brien