Clinical heterogeneity in systematic reviews and health technology assessments: Synthesis of guidance documents and the literature

Danube University, Krems.
International Journal of Technology Assessment in Health Care (Impact Factor: 1.31). 01/2012; 28(1):36-43. DOI: 10.1017/S0266462311000687
Source: PubMed


The aim of this study was to synthesize best practices for addressing clinical heterogeneity in systematic reviews and health technology assessments (HTAs).
We abstracted information from guidance documents and methods manuals made available by international organizations that develop systematic reviews and HTAs. We searched PubMed® to identify studies on clinical heterogeneity and subgroup analysis. Two authors independently abstracted and assessed relevant information.
Methods manuals offer various definitions of clinical heterogeneity. In essence, clinical heterogeneity is considered variability in study population characteristics, interventions, and outcomes across studies. It can lead to effect-measure modification or statistical heterogeneity, which is defined as variability in estimated treatment effects beyond what would be expected by random error alone. Clinical and statistical heterogeneity are closely intertwined but they do not have a one-to-one relationship. The presence of statistical heterogeneity does not necessarily indicate that clinical heterogeneity is the causal factor. Methodological heterogeneity, biases, and random error can also cause statistical heterogeneity, alone or in combination with clinical heterogeneity.
Identifying potential modifiers of treatment effects (i.e., effect-measure modifiers) is important for researchers conducting systematic reviews and HTAs. Recognizing clinical heterogeneity and clarifying its implications helps decision makers to identify patients and patient populations who benefit the most, who benefit the least, and who are at greatest risk of experiencing adverse outcomes from a particular intervention.
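To make the definition of statistical heterogeneity above concrete (variability in estimated treatment effects beyond what random error alone would produce), the following is a minimal sketch of two commonly used summary statistics, Cochran's Q and I². Neither statistic is named in the abstract; they are used here as an assumption about how such variability is typically quantified. The function name `heterogeneity` and the input values are hypothetical.

```python
def heterogeneity(effects, variances):
    """Return (pooled_effect, Q, I2_percent) using inverse-variance weights.

    effects   -- per-study effect estimates (e.g., log odds ratios)
    variances -- per-study sampling variances of those estimates
    Assumption: a fixed-effect pooled estimate serves as the reference point.
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching variances")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")

    # Inverse-variance weights: more precise studies count more.
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)

    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))

    # I^2: share of total variability beyond chance, floored at 0.
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, q, i2


# Hypothetical example: three studies with made-up estimates and variances.
pooled, q, i2 = heterogeneity([0.10, 0.45, 0.30], [0.04, 0.05, 0.03])
print(f"pooled={pooled:.3f}  Q={q:.3f}  I2={i2:.1f}%")
```

A large Q or I² only signals that the effect estimates vary more than chance would explain. As the abstract notes, that variability may stem from clinical heterogeneity, methodological heterogeneity, or bias, so these statistics do not identify the cause.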

  • ABSTRACT: Systematic reviews are essential tools for summarizing information to help users make well-informed decisions about health care options [1]. The Evidence-based Practice Center (EPC) program, supported by the Agency for Healthcare Research and Quality (AHRQ), produces substantial numbers of such reviews, including those that explicitly compare two or more clinical interventions (sometimes termed comparative effectiveness reviews). These reports synthesize a body of literature; the ultimate goal is to help clinicians, policymakers, and patients make well-considered decisions about health care. The goal of strength of evidence assessments is to provide clearly explained, well-reasoned judgments about reviewers’ confidence in their systematic review conclusions so that decisionmakers can use them effectively [2]. Beginning in 2007, AHRQ supported a cross-EPC set of work groups to develop guidance on major elements of designing, conducting, and reporting systematic reviews [3]. Together the materials form the EPC Methods Guide for Effectiveness and Comparative Effectiveness Reviews [4]; one chapter focused on grading the strength of evidence [5]. This chapter updates the original EPC strength of evidence approach [5], presenting findings and recommendations of a work group with experience in applying previous guidance; it should be considered current guidance for EPCs. The guidance applies primarily to systematic reviews of drugs, devices, and other preventive and therapeutic interventions; it may apply to exposures (characteristics or risk factors that are determinants of health outcomes) and broader health services research questions. It does not address reviews of medical tests. EPC reports support the work of many decisionmakers, but EPCs do not themselves develop recommendations or practice guidelines. In particular, we limit our grading strength of evidence approach to individual outcomes. Unlike grading systems that were designed to be used more directly by specific decisionmakers [6–8], we do not develop global summary judgments of the relative benefits and harms of treatment comparisons. We briefly explore the rationale for grading strength of evidence, define domains of concern, and describe our recommended grading system for systematic reviews. The aims of this guidance are twofold: (1) to foster appropriate consistency and transparency in the methods that different EPCs use to grade strength of evidence and (2) to facilitate users’ interpretations of those grades for guideline development or other decisionmaking tasks. Because this field is rapidly evolving, future revisions are anticipated; they will reflect our increasing understanding and experience with the methodology.
    Methods Guide for Effectiveness and Comparative Effectiveness Reviews, Agency for Healthcare Research and Quality (US).
  • ABSTRACT: Anti-Müllerian hormone is measured in reproductive medicine to predict ovarian follicular reserve more accurately, because it is an indirect marker of the quantity and quality of primordial follicles. The literature has repeatedly shown a significant correlation with antral follicle count and with the number and maturity of oocytes in assisted reproductive techniques, which is why we believe this hormone has an increasingly promising future as a marker for the early assessment and prognosis of the infertile patient. In this article, we discuss current information on the role of the marker in assessing ovarian reserve in candidates for assisted reproduction techniques.

    Journal of clinical epidemiology 08/2013; 66(8):809-11. DOI:10.1016/j.jclinepi.2013.05.009 · 3.42 Impact Factor