Assessing Bias in Studies of Prognostic Factors
Jill A. Hayden, DC, PhD; Danielle A. van der Windt, PhD; Jennifer L. Cartwright, MSc; Pierre Côté, DC, PhD; and Claire Bombardier, MD
Previous work has identified 6 important areas to consider when
evaluating validity and bias in studies of prognostic factors: partic-
ipation, attrition, prognostic factor measurement, confounding
measurement and account, outcome measurement, and analysis
and reporting. This article describes the Quality In Prognosis Studies
tool, which includes questions related to these areas that can in-
form judgments of risk of bias in prognostic research.
A working group comprising epidemiologists, statisticians, and
clinicians developed the tool as they considered prognosis studies of
low back pain. Forty-three groups reviewing studies addressing
prognosis in other topic areas used the tool and provided feedback.
Most reviewers (74%) reported that reaching consensus on
judgments was easy. Median completion time per study was 20
minutes; interrater agreement (κ statistic) reported by 9 review teams
varied from 0.56 to 0.82 (median, 0.75). Some reviewers reported
challenges making judgments across prompting items, which were
addressed by providing comprehensive guidance and examples.
The refined Quality In Prognosis Studies tool may be useful to
assess the risk of bias in studies of prognostic factors.
Ann Intern Med. 2013;158:280-286. www.annals.org
For author affiliations, see end of text.
Well-conducted prognostic research is important for
clinical decision making. It informs patients about
possible outcomes, identifies risk groups for stratified man-
agement, and helps target specific prognostic factors for
modification (1). However, previous research shows many
methodological shortcomings in the design and conduct of
studies that address prognosis (2–4).
Critical appraisal of prognostic studies is essential to
assess and identify biases sufficiently large to distort study
results. A tool to guide such critical appraisal would help
reviewers conducting systematic reviews and developing
clinical practice guidelines, researchers conducting primary
studies, and readers of such studies.
During assessment of risk of bias, 6 important do-
mains should be considered when evaluating validity and
bias in studies of prognostic factors: study participation,
study attrition, prognostic factor measurement, confound-
ing measurement and account, outcome measurement, and
analysis and reporting (1). Researchers have used these rec-
ommendations to guide design and conduct of primary
prognosis studies (5, 6) and as a guideline to improve re-
porting (6). In this article, we describe the refinement and
use of the Quality In Prognosis Studies (QUIPS) tool to
assess risk of bias in studies of prognostic factors.
METHODS
Development of the QUIPS Tool
The Figure shows a schematic of the project. Fourteen
working group members, including epidemiologists, statis-
ticians, and clinicians, collaborated in tool development
(7). The working group used an e-mail–based, modified
Delphi approach (8) and nominal group techniques to re-
fine prompting items for assessing bias domains and pro-
posed ratings for the bias assessments as they considered
prognosis studies of low back pain.
During an in-person workshop in 2006 that included
working group members and other participants, a facilita-
tor presented issues of agreement or dissent related to as-
sessment of the bias domains. Through an iterative process
of discussion and voting, workshop participants reached
consensus on the wording of prompting items to guide
ratings of high, moderate, or low risk of bias related to the
6 domains. These recommendations were formatted as a
paper and an electronic tool and were used to assess risk of
bias in studies included in a systematic review of prognostic
factors in back pain (9). An overlapping group of 22 ex-
perts further discussed and refined the tool before and dur-
ing a workshop in 2007.
Use of the Tool and Feedback
Since 2007, preliminary versions and subsequently a
refined electronic version of the QUIPS tool were shared
with and adapted by other research teams conducting sys-
tematic reviews of studies addressing prognosis, including
review teams in rheumatology (10), cardiovascular disease
(11, 12), and kidney disease (13, 14). We then used a
structured Web-based survey to solicit feedback from 83
research teams that had used the QUIPS tool. Potential
authors were identified by using a citation search in
PubMed for the original 2006 QUIPS paper (1) and by
reviewing personal communications that the primary in-
vestigator had received (Figure).
The survey was constructed using Opinio (Object-
Planet, Oslo, Norway). We collected information on the
characteristics of the systematic reviews that had used the
QUIPS tool (such as topic area and review status), charac-
teristics of the review teams (number of reviewers involved
in the quality assessment process), how the tool was used
(domains used, aspects of the tool used for quality assess-
ment, and risk of bias judgments), its perceived ease of use
(time to complete an assessment by using the tool and
problems encountered), and any suggested modifications.
A copy of the complete survey is available on request.
Annals of Internal Medicine. Research and Reporting Methods. © 2013 American College of Physicians.
Role of the Funding Source
There was no direct funding for this project.
RESULTS
The QUIPS Tool
The Table summarizes the 6 bias domains, prompting
items and considerations for each domain, and overall rating
assessments. The Supplement, available at www.annals.org,
shows the full version of the QUIPS tool.
The Study Participation domain addresses the repre-
sentativeness of the study sample. It helps the assessor
judge whether the study’s reported association is a valid
estimate of the true relationship between the prognostic
factor and the outcome of interest in the source popula-
tion. To make this judgment, the assessor considers the
proportion of eligible persons who participate in the study,
as well as descriptions of the source population, baseline
study sample, sampling frame and recruitment, and inclu-
sion and exclusion criteria. A study would be considered as
having high risk of bias if the participation rate is low, the
study sample has a very different age and sex distribution
from the source population, or a very selective rather than
consecutive sample of eligible patients was recruited. Con-
versely, studies with high participation of eligible and con-
secutively recruited patients who have characteristics simi-
lar to those in the source population would have low risk of
bias.
Figure. Schematic of the project from 2006 through 2011 to develop and assess the QUIPS tool for assessing risk of bias in prognostic factor studies.
Tool development:
Hayden et al (1) publication: recommendations to assess 6 bias domains and relevant considerations
Activities of working group to refine prompting items and propose ratings
Facilitated discussion workshop: consensus on wording of prompting items to guide risk of bias ratings
Development of paper copy and electronic QUIPS tool
Identification of survey recipients:
Distribution of QUIPS tool to review groups by word of mouth (n = 23)
Citation search in PubMed for systematic reviews citing Hayden et al (1) (n = 97)
Selection of studies that used the tool to assess risk of bias (n = 80)
Identification of review teams (duplicates removed); author contact information identified (n = 60)
Survey:
Survey sent through Opinio* and by personal e-mail to identified authors (n = 83)
Survey responses (n = 43)
We selected review teams for the survey if they conducted a prognosis systematic review, cited Hayden and colleagues (1) with reference to critical
appraisal of included studies, and used a tool that sufficiently resembled the QUIPS tool (that is, included at least 4 of 6 domains of the QUIPS tool).
QUIPS = Quality In Prognosis Studies.
*ObjectPlanet, Oslo, Norway.
Annals of Internal Medicine. 19 February 2013; Volume 158 • Number 4. www.annals.org
The Study Attrition domain addresses whether participants
with follow-up data represent persons enrolled in
the study. It helps the assessor judge whether the reported
association between the prognostic factor and outcome is
biased by the assessment of outcomes in a selected group of
participants who completed the study. To make this judgment,
the assessor considers the study withdrawal rate (that
is, whether many participants withdrew and whether there
is a higher risk for systematic differences that may bias the
prognostic factor association), information about why
participants were lost to follow-up (that is, there is less
concern if all persons provide random explanations), and
observed differences in characteristics of persons lost to
follow-up compared with participants who completed the
study.
A study would be considered to have high risk of bias
if it is probable that persons who completed the study
differ from those lost to follow-up in a way that distorts the
association between the prognostic factor and outcome.
Conversely, studies with complete follow-up, or evidence
of participants missing at random, have low risk of bias.
Table. Summary of the Bias Domains, Prompting Items, and Ratings of the QUIPS Tool*
Bias Domain 1. Study Participation
Optimal study or characteristics of unbiased study: The study sample adequately represents the population of interest
Prompting items and considerations†
a. Adequate participation in the study by eligible persons
b. Description of the source population or population of interest
c. Description of the baseline study sample
d. Adequate description of the sampling frame and recruitment
e. Adequate description of the period and place of recruitment
f. Adequate description of inclusion and exclusion criteria
Ratings‡
High risk of bias: The relationship between the PF and outcome is very likely to be different for participants and eligible nonparticipants
Moderate risk of bias: The relationship between the PF and outcome may be different for participants and eligible nonparticipants
Low risk of bias: The relationship between the PF and outcome is unlikely to be different for participants and eligible nonparticipants
Bias Domain 2. Study Attrition
Optimal study or characteristics of unbiased study: The study data available (i.e., participants not lost to follow-up) adequately represent the study sample
Prompting items and considerations†
a. Adequate response rate for study participants
b. Description of attempts to collect information on participants who dropped out
c. Reasons for loss to follow-up are provided
d. Adequate description of participants lost to follow-up
e. There are no important differences between participants who completed the study and those who did not
Ratings‡
High risk of bias: The relationship between the PF and outcome is very likely to be different for completing and noncompleting participants
Moderate risk of bias: The relationship between the PF and outcome may be different for completing and noncompleting participants
Low risk of bias: The relationship between the PF and outcome is unlikely to be different for completing and noncompleting participants
Bias Domain 3. Prognostic Factor Measurement
Optimal study or characteristics of unbiased study: The PF is measured in a similar way for all participants
Prompting items and considerations†
a. A clear definition or description of the PF is provided
b. Method of PF measurement is adequately valid and reliable
c. Continuous variables are reported or appropriate cut points are used
d. The method and setting of measurement of PF is the same for all study participants
e. Adequate proportion of the study sample has complete data for the PF
f. Appropriate methods of imputation are used for missing PF data
Ratings‡
High risk of bias: The measurement of the PF is very likely to be different for different levels of the outcome of interest
Moderate risk of bias: The measurement of the PF may be different for different levels of the outcome of interest
Low risk of bias: The measurement of the PF is unlikely to be different for different levels of the outcome of interest
Bias Domain 4. Outcome Measurement
Optimal study or characteristics of unbiased study: The outcome of interest is measured in a similar way for all participants
Prompting items and considerations†
a. A clear definition of the outcome is provided
b. Method of outcome measurement used is adequately valid and reliable
c. The method and setting of outcome measurement is the same for all study participants
Ratings‡
High risk of bias: The measurement of the outcome is very likely to be different related to the baseline level of the PF
Moderate risk of bias: The measurement of the outcome may be different related to the baseline level of the PF
Low risk of bias: The measurement of the outcome is unlikely to be different related to the baseline level of the PF
PF = prognostic factor; QUIPS = Quality In Prognosis Studies.
*The Supplement (available at www.annals.org) shows the full QUIPS tool.
†Prompting items are to guide the user’s judgment about risk of bias for each domain and are taken together to inform the overall judgment of potential bias and facilitate
consensus among reviewers for each of the 6 domains. Some items may not be relevant to the specific study or the review research question; modification/clarification of the
prompting items for the specific review question is encouraged.
‡Each domain is rated as high, moderate, or low risk of bias considering the prompting items.
The Prognostic Factor Measurement domain addresses
adequacy of prognostic factor measurement. It helps the
assessor judge whether the study measured the prognostic
factor in a similar, valid, and reliable way for all partici-
pants. To make this judgment, the assessor considers the
clarity of the definition of the prognostic factor, evidence
on the validity and reliability of the measurement ap-
proach, and the similarity of measurement and appropriate
reporting of the prognostic factor for all participants. In-
formation considered may include outside sources on mea-
surement properties, blind or independent measurement,
and limited reliance on recall.
A study would be considered to have low risk of bias if
the prognostic factor is measured similarly for all partici-
pants and uses a valid, reliable measure. Conversely, studies
that use an unreliable method to measure the prognostic
factor or use different approaches for participants that re-
sult in systematic misclassification have high risk of bias.
The Outcome Measurement domain addresses the ad-
equacy of outcome measurement. It helps the assessor
judge whether the study measured the outcome in a simi-
lar, reliable, and valid way for all participants. To make this
judgment, the assessor considers the clarity of outcome
definition, evidence on the validity and reliability of the
measurement, and similarity of measurement (that is, sim-
ilar setting, method of measurement, and follow-up dura-
tion) for different levels of the prognostic factor. Informa-
tion considered may include relevant outside sources on
measurement properties, blind measurement, and confir-
mation of outcome with another valid and reliable test to
support a judgment.
A study would have high risk of bias if there is likely to
be differential measurement of outcome related to the ex-
tent of exposure to the prognostic factor; for example, if
cardiovascular outcomes are assessed more extensively in
smokers than in nonsmokers. A study would be considered
to have low risk of bias if the outcome is measured simi-
larly for all participants and uses a valid, reliable measure.
The Study Confounding domain addresses potential
confounding factors. It helps the assessor judge whether
another factor may explain the study’s reported association.
To make this judgment, the assessor considers the validity,
reliability, and similarity of measurement of potential con-
founders (defined a priori) for all participants and whether
all important confounding factors are accounted for in the
study design or analysis.
A study would have high risk of bias if another factor
related to both the prognostic factor and the outcome is
likely to explain the effect of the prognostic factor. Con-
versely, studies with adequate measurement of important
potential confounding variables and inclusion of these vari-
ables in a prespecified multivariable analysis have low risk
of bias.
The Statistical Analysis and Reporting domain ad-
dresses the appropriateness of the study’s statistical analysis
and completeness of reporting. It helps the assessor judge
whether results are likely to be spurious or biased because
of analysis or reporting. To make this judgment, the asses-
sor considers the data presented to determine the adequacy
of the analytic strategy and model-building process and
investigates concerns about selective reporting. Selective re-
porting is an important issue in prognostic factor reviews
because studies commonly report only factors positively
associated with outcomes. A study would be considered to
have low risk of bias if the statistical analysis is appropriate
for the data, statistical assumptions are satisfied, and all
primary outcomes are reported.
Table (continued)
Bias Domain 5. Study Confounding
Optimal study or characteristics of unbiased study: Important potential confounding factors are appropriately accounted for
Prompting items and considerations
a. All important confounders are measured
b. Clear definitions of the important confounders measured are provided
c. Measurement of all important confounders is adequately valid and reliable
d. The method and setting of confounding measurement are the same for all study participants
e. Appropriate methods are used if imputation is used for missing confounder data
f. Important potential confounders are accounted for in the study design
g. Important potential confounders are accounted for in the analysis
Ratings
High risk of bias: The observed effect of the PF on the outcome is very likely to be distorted by another factor related to PF and outcome
Moderate risk of bias: The observed effect of the PF on outcome may be distorted by another factor related to PF and outcome
Low risk of bias: The observed effect of the PF on outcome is unlikely to be distorted by another factor related to PF and outcome
Bias Domain 6. Statistical Analysis and Reporting
Optimal study or characteristics of unbiased study: The statistical analysis is appropriate, and all primary outcomes are reported
Prompting items and considerations
a. Sufficient presentation of data to assess the adequacy of the analytic strategy
b. Strategy for model building is appropriate and is based on a conceptual framework or model
c. The selected statistical model is adequate for the design of the study
d. There is no selective reporting of results
Ratings
High risk of bias: The reported results are very likely to be spurious or biased related to analysis or reporting
Moderate risk of bias: The reported results may be spurious or biased related to analysis or reporting
Low risk of bias: The reported results are unlikely to be spurious or biased related to analysis or reporting
Using the Tool
For each of the 6 domains in the QUIPS tool, re-
sponses to the prompting items are taken together to in-
form the judgment of risk of bias. Information and meth-
odological comments supporting the item assessment
should be recorded (cited directly from the study publica-
tion). Judgments should be made with consensus among at
least 2 assessors. Some items may not be relevant to the
specific study or the review question and may be skipped
or omitted. For example, if a study has a 100% response
rate, the prompting items in the Study Attrition domain
related to collection of information on participants who
dropped out of the study, reasons for loss to follow-up, and
description and comparison of key characteristics of partic-
ipants lost to follow-up with study completers are not
relevant.
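The recording step described above can be sketched in code. This is only an illustration, not part of the QUIPS tool: the class names, the relevant flag, and the support field are hypothetical, and the example study and its supporting comment are invented. The domain names and the prompting items are taken from the Table.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# The six QUIPS bias domains, as named in the Table.
DOMAINS = (
    "Study Participation",
    "Study Attrition",
    "Prognostic Factor Measurement",
    "Outcome Measurement",
    "Study Confounding",
    "Statistical Analysis and Reporting",
)
RATINGS = ("low", "moderate", "high")


@dataclass
class PromptingItem:
    """One prompting item (hypothetical structure, for illustration only)."""
    text: str
    relevant: bool = True  # False when the item does not apply to this study
    support: str = ""      # comment or quotation from the study publication


@dataclass
class DomainAssessment:
    """Consensus judgment for one domain, informed by its prompting items."""
    domain: str
    items: List[PromptingItem] = field(default_factory=list)
    rating: Optional[str] = None

    def __post_init__(self) -> None:
        if self.domain not in DOMAINS:
            raise ValueError(f"unknown domain: {self.domain}")
        if self.rating is not None and self.rating not in RATINGS:
            raise ValueError(f"unknown rating: {self.rating}")

    def relevant_items(self) -> List[PromptingItem]:
        """Items that still need to be considered for this study."""
        return [item for item in self.items if item.relevant]


# Invented study with a 100% response rate: the items about dropouts
# are marked as not relevant and only the response rate item remains.
attrition = DomainAssessment(
    domain="Study Attrition",
    items=[
        PromptingItem("Adequate response rate for study participants",
                      support="All enrolled participants completed follow-up."),
        PromptingItem("Description of attempts to collect information on "
                      "participants who dropped out", relevant=False),
        PromptingItem("Reasons for loss to follow-up are provided",
                      relevant=False),
        PromptingItem("Adequate description of participants lost to follow-up",
                      relevant=False),
        PromptingItem("There are no important differences between participants "
                      "who completed the study and those who did not",
                      relevant=False),
    ],
    rating="low",
)
```

Keeping the supporting text next to each item makes the basis for the consensus rating easy to audit when two assessors compare notes.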
To grade the tool, each of the 6 potential bias domains
is rated as having high, moderate, or low risk of bias. For
example, with respect to the Study Attrition domain, study
A reported an 80% response rate (20% of the study sample
lost to follow-up); the authors tried to determine reasons
for noncompletion, collected and presented information
about key characteristics of those lost to follow-up, and
found no differences between completers and noncom-
pleters on important characteristics and outcomes. This
study was rated as having low risk of bias due to study
attrition. Study B, however, would be judged as having
high risk of bias due to attrition with the same 80% re-
sponse rate if important systematic differences existed be-
tween participants who did and those who did not com-
plete the study. Finally, study C would be judged as having
low risk of bias due to attrition with only the information
that 99% of a large study sample completed outcome
assessment.
Assessing the overall risk of bias in each study may also
be useful. To judge overall risk, one could describe studies
with a low risk of bias as those in which all, or the most
important (as determined a priori), of the 6 important bias
domains are rated as having low risk of bias. We recom-
mend use of sensitivity analyses to explore the effect of the
selected definition. In line with the Cochrane Risk of Bias
tool for intervention studies (15) and the QUADAS-2
(Quality Assessment of Diagnostic Accuracy Studies) tool
for diagnostic studies (16), we recommend against the use
of a summated score for overall study quality.
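The overall judgment and the recommended sensitivity analysis can be illustrated with a short sketch. It assumes one possible reading of the rule above: a study counts as overall low risk when every domain in a prespecified set is rated low. The function name, the three studies, and the choice of key domains are hypothetical. No score is summed.

```python
from typing import Dict, Iterable, Optional

DOMAINS = (
    "Study Participation",
    "Study Attrition",
    "Prognostic Factor Measurement",
    "Outcome Measurement",
    "Study Confounding",
    "Statistical Analysis and Reporting",
)


def is_overall_low_risk(ratings: Dict[str, str],
                        key_domains: Optional[Iterable[str]] = None) -> bool:
    """True if all key domains (default: all rated domains) are rated 'low'."""
    domains = list(ratings) if key_domains is None else list(key_domains)
    return all(ratings[d] == "low" for d in domains)


# Invented domain ratings for three studies.
all_low = dict.fromkeys(DOMAINS, "low")
studies = {
    "Study A": dict(all_low),
    "Study B": {**all_low, "Study Confounding": "moderate"},
    "Study C": {**all_low, "Study Attrition": "high"},
}

# Definition 1: all 6 domains must be low.
low_strict = [name for name, r in studies.items() if is_overall_low_risk(r)]

# Definition 2: only domains chosen a priori as most important must be low
# (the two domains named here are an arbitrary example).
key = ("Study Participation", "Study Attrition")
low_key = [name for name, r in studies.items() if is_overall_low_risk(r, key)]

print("All domains low:", low_strict)
print("Key domains low:", low_key)
```

A sensitivity analysis would then repeat the evidence synthesis on each resulting subset of studies and compare the findings.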
Feedback From Reviewers
Forty-three of the 83 review authors invited to provide
feedback on the QUIPS tool did so (Figure). The reviews
came from diverse topic areas, including musculoskeletal
disorders (13 of 43 review teams), obstetrics and pediatrics
(7 of 43), heart or vascular disease (6 of 43), and cancer (4
of 43). Most focused on prognostic factors (28 of 43 re-
views), although some examined overall prognosis (6 of
43), risk prediction models (9 of 43), or differential treat-
ment effect by prognostic factors (2 of 43).
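The counts reported above and in the Figure can be checked with simple arithmetic. The sketch below uses only numbers given in the text; the variable names are illustrative.

```python
# Survey recipients (Figure): word of mouth plus citation search.
invited = 23 + 60
responded = 43
response_rate = responded / invited

# Topic areas named in the text, out of 43 responding review teams.
topic_counts = {
    "musculoskeletal disorders": 13,
    "obstetrics and pediatrics": 7,
    "heart or vascular disease": 6,
    "cancer": 4,
}
other_topics = responded - sum(topic_counts.values())

print(f"Invited: {invited}, response rate: {response_rate:.1%}")
for topic, count in topic_counts.items():
    print(f"{topic}: {count}/{responded} = {count / responded:.1%}")
print(f"other topic areas: {other_topics}/{responded}")
```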
Appendix Table 1 (available at www.annals.org)
shows the experiences of the researchers who used the
QUIPS tool. Most review teams had 2 reviewers indepen-
dently complete risk of bias assessments and used consen-
sus processes to resolve disagreements. Most review teams
(28 of 38) reported that the process of reaching consensus
on assessments was “easy.” Interrater agreement, reported
as percentage of agreement by 9 review teams (10, 17–24)
on 205 studies (reported in peer-reviewed publications or
by personal communication), varied between 70% and
89.5% (median, 83.5%).
The κ statistic for independent rating of QUIPS
items, reported by 9 review teams (10, 19, 23, 25–30) on
159 studies, varied from 0.56 to 0.82 (median, 0.75). One
review team (31 studies) (25) reported interrater agreement
scores for individual bias domains: study participation
(κ = 0.73), study attrition (κ = 1.0), prognostic factor
measurement (κ = 1.0), confounding measurement and
account (κ = 0.4), outcome measurement (κ = 0.73), and
analysis and reporting (κ = 0.73).
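For readers less familiar with these agreement measures, the following sketch computes percentage agreement and an unweighted Cohen κ for 2 raters. It rests on an assumption: the text does not say which form of κ each review team used. The ratings are invented for illustration and are not data from any of the reviews.

```python
from collections import Counter
from typing import Sequence


def percent_agreement(r1: Sequence[str], r2: Sequence[str]) -> float:
    """Proportion of studies on which both raters gave the same rating."""
    if len(r1) != len(r2) or len(r1) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(a == b for a, b in zip(r1, r2)) / len(r1)


def cohen_kappa(r1: Sequence[str], r2: Sequence[str]) -> float:
    """Unweighted Cohen kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(r1)
    p_observed = percent_agreement(r1, r2)
    counts_1, counts_2 = Counter(r1), Counter(r2)
    # Agreement expected by chance, from each rater's marginal frequencies.
    p_expected = sum(counts_1[c] * counts_2[c] for c in counts_1) / (n * n)
    if p_expected == 1.0:
        # Both raters used a single identical category: kappa is undefined.
        return float("nan")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Invented ratings of one bias domain for 10 studies by 2 raters.
rater_1 = ["low", "low", "moderate", "high", "low",
           "moderate", "high", "low", "moderate", "low"]
rater_2 = ["low", "low", "moderate", "high", "moderate",
           "moderate", "high", "low", "low", "low"]

print(f"Percentage agreement: {percent_agreement(rater_1, rater_2):.1%}")
print(f"Cohen kappa: {cohen_kappa(rater_1, rater_2):.2f}")
```

Because κ corrects for agreement expected by chance, it is lower than the raw percentage agreement whenever the raters disagree at all. A weighted κ, which gives partial credit for adjacent ratings, is another option for ordered categories such as low, moderate, and high.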
Review teams reported that using the QUIPS tool
took a median of 20 minutes per study; 5 reviewers re-
ported that it took their team longer than 1 hour per study.
Many review teams included members with specific train-
ing or education to complete the assessments.
Most review teams (32 of 42) used versions of the
QUIPS tool that were developed from recommendations
in Hayden and colleagues’ article (1). Ten review teams
had access to the refined electronic QUIPS tool. One team
described combining the QUIPS recommendations with
the items from the Reporting Recommendations for Tu-
mour Marker Prognostic Studies reporting guidelines (31),
and 2 review groups referred to versions from other authors
(for example, the National Institute for Health and Clini-
cal Excellence guideline manual [32]). Fifteen groups
did not judge risk of bias for the 6 domains but rather
rated only the prompting items. Approximately half of
the reviewers (15 of 34) reported using a count or an al-
gorithm of the prompting items, and half (16 of 34) used
judgment considering prompting items to rate the domain
and overall risk of bias (Appendix Table 2, available at
www.annals.org).
The results of the risk of bias assessments were pre-
sented and used in various ways. The most common ap-
proaches were to present individual prompting item ratings
for each included study and additionally report an assess-
ment of overall study quality. Two reviewers presented no
critical appraisal results in their reviews.
Although feedback was positive, some reviewers re-
ported challenges. Two review teams reported that they
had difficulty making judgments across multiple prompt-
ing items, and 2 review groups commented that poor re-
porting in their included studies made judgment difficult.
Seventeen review teams reported that they had advanced
epidemiologic training for assessors. Seven review teams,
using the tool for types of prognosis reviews other than
prognostic factor reviews, commented that they modified
the tool by adding items or removing unnecessary items or
domains.
DISCUSSION
The QUIPS tool supports a systematic appraisal of
bias in studies of prognostic factors. It is based on recom-
mendations from a comprehensive review of quality assess-
ment in prognosis systematic reviews (1) and is informed
by basic epidemiologic principles. Independently devel-
oped and modified versions of the tool have been success-
fully used by several research groups, with moderate to
substantial interrater reliability.
We previously found that quality assessment in prog-
nosis systematic reviews is inconsistent and often incom-
plete (1). A recent review of quality assessment in chronic
disease epidemiology systematic reviews similarly found
that only 55% of included reviews reported quality assess-
ment (33). Sanderson and associates (34) reviewed pub-
lished tools that assess risk of bias in observational epide-
miology studies. Similar to our original review of quality
assessment tools used in prognosis systematic reviews, they
reported a lack of suitable tools (34). The QUIPS tool that
we developed fills this gap and includes a comprehensive
set of prompting items with clear suggestions for opera-
tionalization and grading.
Some review groups participating in this study com-
mented on the need to modify and refine the prompting
items and eliminate some overlap of items. We encourage
operationalization of the tool for specific purposes, includ-
ing specifying key characteristics (for example, potential
confounders), omitting any irrelevant prompting items,
and adding new items where needed. Clear specification of
the tool items will probably increase interrater agreement.
For systematic reviews, operationalization of the tool
should be done a priori and authors should make their
application of the tool accessible to readers of their pub-
lished article.
The QUIPS tool was designed to assess prognostic
factor studies; however, it can provide a starting point for
development or refinement of quality assessment tools for
other types of prognostic studies. For example, it may be
modified to assess studies of overall prognosis (such as
Moulaert and coworkers’ systematic review [18]) by omit-
ting domains related to prognostic factor measurement and
confounding, along with slight adjustments to the prompt-
ing questions for the analysis domain.
Several review groups using the QUIPS tool reported
counting prompting items as a scale. We recommend the
assessment of prompting items to guide judgment of the 6
bias domains rather than using them as a scale. This ap-
proach involves balancing information about competing
design or conduct features and is more transparent. How-
ever, we acknowledge that such a consensus-based judg-
ment of potential bias is more challenging and requires
assessors to be knowledgeable of epidemiologic methods.
Online training tools and examples using the QUIPS tool
should be developed to support training needs.
Our study has limitations. The group of experts who
developed the tool were from a single topic area, poten-
tially limiting generalizability. Furthermore, participants in
our retrospective survey about the tool and our reported
reliability scores were from a selected group of interested
systematic reviewers. Our users probably have more ad-
vanced training and may overestimate usability and reli-
ability scores for the wider population of potential users.
Future studies should further evaluate the QUIPS tool
by using a prospective study design. Reliability testing
should be done on a larger, more representative set of stud-
ies and tool users, including assessing reliability of individ-
ual domain ratings, as well as consensus ratings between
groups. Exploring the effect of study-level factors on reli-
ability of bias appraisal by using the QUIPS tool will also
help identify potential problem areas in need of further
guidance (35).
The relationship between domain ratings and prog-
nostic factor associations to provide empirical evidence of
design-related bias (that is, evidence of over- or underesti-
mation of prognostic factor associations with judgments of
increased bias related to each of the domains) needs to be
examined. Our previous evaluation of systematic reviews of
prognostic factors (1) found limited investigation of the
association between study design characteristics and effect
estimate (42 of 163 reviews reported), and findings were
inconsistent for specific biases. Assessment of potential bi-
ases in prognosis studies included in systematic reviews by
using a domain-based approach will facilitate future meta-
epidemiologic studies to determine the effect of design-
related biases.
Assessment of potential biases is particularly challeng-
ing in observational studies that are designed to investigate
prognostic factors. The refined QUIPS tool is useful and
reliable for systematic reviewers, study authors, and readers
to guide comprehensive assessment of 6 bias domains in
studies of prognostic factors.
From Dalhousie University, Halifax, Nova Scotia, Canada; Arthritis Re-
search UK Primary Care Centre, Primary Care and Health Sciences,
Keele University, Staffordshire, United Kingdom; University of Ontario
Institute of Technology, Oshawa, Ontario, Canada; and University of
Toronto and Institute for Work & Health, Toronto, Ontario, Canada.
Acknowledgment: The authors thank the QUIPS-Low Back Pain
Working Group members (2006 and 2007) for their important contri-
butions. They also thank the prognosis systematic review authors who
completed their survey and review authors who responded to their addi-
tional questions and requests for data, including Amika Singh, James
Chalmers, Roger Chou, Fiona Clay, Hanneke Creemers, Lotte Dyhrberg
O’Neill, Jan Hartvigsen, Ross Iles, David Jimenez, Sindhu Johnson,
Bindee Kuriya, Jolanda Luime, Veronique Moulaert, Tinca Polderman,
Cara Wasywich, Stephen Wilton, Susan Woolfenden, Lexie Wright, and
Christina Wyatt.
Research and Reporting Methods: Assessing Bias in Studies of Prognostic Factors
www.annals.org 19 February 2013 Annals of Internal Medicine Volume 158 • Number 4 285
Financial Support: Dr. Hayden received infrastructure funding through
the Nova Scotia Cochrane Resource Centre provided by the Nova Scotia
Health Research Foundation and holds a Research Professorship in Ep-
idemiology funded by the Canadian Chiropractic Research Foundation
and Dalhousie University. Dr. van der Windt is a member of the Prog-
nosis Research Strategy Initiative Medical Research Council, Prognosis
Research Strategy Initiative Partnership (G0902393/99558).
Potential Conflicts of Interest: Disclosures can be viewed at
www.acponline.org/authors/icmje/ConflictOfInterestForms.do?msNum=M12-1871.
Requests for Single Reprints: Jill A. Hayden, DC, PhD, Department of
Community Health & Epidemiology, Dalhousie University, 5790 Uni-
versity Avenue, Room 222, Halifax, Nova Scotia B3H 1V7, Canada;
e-mail, jhayden@dal.ca.
Current author addresses and author contributions are available at
www.annals.org.
References
1. Hayden JA, Côté P, Bombardier C. Evaluation of the quality of prognosis
studies in systematic reviews. Ann Intern Med. 2006;144:427-37. [PMID:
16549855]
2. Hemingway H, Riley RD, Altman DG. Ten steps towards improving prog-
nosis research. BMJ. 2009;339:b4184. [PMID: 20042483]
3. Hayden JA, Chou R, Hogg-Johnson S, Bombardier C. Systematic reviews of
low back pain prognosis had variable methods and results: guidance for future
prognosis reviews. J Clin Epidemiol. 2009;62:781-796.e1. [PMID: 19136234]
4. Riley RD, Sauerbrei W, Altman DG. Prognostic markers in cancer: the
evolution of evidence from single studies to meta-analysis, and beyond. Br J
Cancer. 2009;100:1219-29. [PMID: 19367280]
5. Kamper SJ, Hancock MJ, Maher CG. Optimal designs for prediction studies
of whiplash. Spine (Phila Pa 1976). 2011;36:S268-74. [PMID: 22020594]
6. Hemingway H, Philipson P, Chen R, Fitzpatrick NK, Damant J, Shipley M,
et al. Evaluating the quality of research into a single prognostic biomarker: a
systematic review and meta-analysis of 83 studies of C-reactive protein in stable
coronary artery disease. PLoS Med. 2010;7:e1000286. [PMID: 20532236]
7. Hayden JA, Côté P, Steenstra IA, Bombardier C; QUIPS-LBP Working
Group. Identifying phases of investigation helps planning, appraising, and apply-
ing the results of explanatory prognosis studies. J Clin Epidemiol. 2008;61:552-
60. [PMID: 18471659]
8. Jones J, Hunter D. Consensus methods for medical and health services re-
search. BMJ. 1995;311:376-80. [PMID: 7640549]
9. Hayden JA. Methodological Issues in Systematic Reviews of Prognosis and
Prognostic Factors: Low Back Pain. Toronto: Univ Toronto; 2007.
10. Chapple CM, Nicholson H, Baxter GD, Abbott JH. Patient characteristics
that predict progression of knee osteoarthritis: a systematic review of prognostic
studies. Arthritis Care Res (Hoboken). 2011;63:1115-25. [PMID: 21560257]
11. Pickett CA, Jackson JL, Hemann BA, Atwood JE. Carotid bruits as a prog-
nostic indicator of cardiovascular death and myocardial infarction: a meta-
analysis. Lancet. 2008;371:1587-94. [PMID: 18468542]
12. Pickett CA, Jackson JL, Hemann BA, Atwood JE. Carotid bruits and cere-
brovascular disease risk: a meta-analysis. Stroke. 2010;41:2295-302. [PMID:
20724720]
13. Palmer SC, Hayen A, Macaskill P, Pellegrini F, Craig JC, Elder GJ, et al.
Serum levels of phosphorus, parathyroid hormone, and calcium and risks of death
and cardiovascular disease in individuals with chronic kidney disease: a systematic
review and meta-analysis. JAMA. 2011;305:1119-27. [PMID: 21406649]
14. Mathew A, Devereaux PJ, O’Hare A, Tonelli M, Thiessen-Philbrook H,
Nevis IF, et al. Chronic kidney disease and postoperative mortality: a systematic
review and meta-analysis. Kidney Int. 2008;73:1069-81. [PMID: 18288098]
15. Higgins JP, Altman DG, Gøtzsche PC, Jüni P, Moher D, Oxman AD, et
al; Cochrane Bias Methods Group. The Cochrane Collaboration’s tool for as-
sessing risk of bias in randomised trials. BMJ. 2011;343:d5928. [PMID:
22008217]
16. Whiting PF, Rutjes AW, Westwood ME, Mallett S, Deeks JJ, Reitsma JB,
et al; QUADAS-2 Group. QUADAS-2: a revised tool for the quality assessment
of diagnostic accuracy studies. Ann Intern Med. 2011;155:529-36. [PMID:
22007046]
17. Singh AS, Mulder C, Twisk JW, van Mechelen W, Chinapaw MJ. Track-
ing of childhood overweight into adulthood: a systematic review of the literature.
Obes Rev. 2008;9:474-88. [PMID: 18331423]
18. Moulaert VR, Verbunt JA, van Heugten CM, Wade DT. Cognitive impair-
ments in survivors of out-of-hospital cardiac arrest: a systematic review. Resusci-
tation. 2009;80:297-305. [PMID: 19117659]
19. van Drongelen A, Boot CR, Merkus SL, Smid T, van der Beek AJ. The
effects of shift work on body weight change—a systematic review of longitudinal
studies. Scand J Work Environ Health. 2011;37:263-75. [PMID: 21243319]
20. Clay FJ, Newstead SV, McClure RJ. A systematic review of early prognostic
factors for return to work following acute orthopaedic trauma. Injury. 2010;41:
787-803. [PMID: 20435304]
21. Nijrolder I, van der Horst H, van der Windt D. Prognosis of fatigue.
A systematic review. J Psychosom Res. 2008;64:335-49. [PMID: 18374732]
22. Proper KI, Singh AS, van Mechelen W, Chinapaw MJ. Sedentary behaviors
and health outcomes among adults: a systematic review of prospective studies.
Am J Prev Med. 2011;40:174-82. [PMID: 21238866]
23. Spee LA, Madderom MB, Pijpers M, van Leeuwen Y, Berger MY. Associ-
ation between helicobacter pylori and gastrointestinal symptoms in children. Pe-
diatrics. 2010;125:e651-69. [PMID: 20156901]
24. van Duijvenbode DC, Hoozemans MJ, van Poppel MN, Proper KI. The
relationship between overweight and obesity, and sick leave: a systematic review.
Int J Obes (Lond). 2009;33:807-16. [PMID: 19528969]
25. Jiménez D, Uresandi F, Otero R, Lobo JL, Monreal M, Martí D, et al.
Troponin-based risk stratification of patients with acute nonmassive pulmonary
embolism: systematic review and metaanalysis. Chest. 2009;136:974-82. [PMID:
19465511]
26. Wright AA, Cook C, Abbott JH. Variables associated with the progression of
hip osteoarthritis: a systematic review. Arthritis Rheum. 2009;61:925-36.
[PMID: 19565541]
27. Elshout G, Monteny M, van der Wouden JC, Koes BW, Berger MY.
Duration of fever and serious bacterial infections in children: a systematic review.
BMC Fam Pract. 2011;12:33. [PMID: 21575193]
28. Gieteling MJ, Bierma-Zeinstra SM, Lisman-van Leeuwen Y, Passchier J,
Berger MY. Prognostic factors for persistence of chronic abdominal pain in chil-
dren. J Pediatr Gastroenterol Nutr. 2011;52:154-61. [PMID: 21057328]
29. Jeejeebhoy FM, Zelop CM, Windrim R, Carvalho JC, Dorian P, Morrison
LJ. Management of cardiac arrest in pregnancy: a systematic review. Resuscita-
tion. 2011;82:801-9. [PMID: 21549495]
30. Johnson SR, Swiston JR, Swinton JR, Granton JT. Prognostic factors for
survival in scleroderma associated pulmonary arterial hypertension. J Rheumatol.
2008;35:1584-90. [PMID: 18597400]
31. McShane LM, Altman DG, Sauerbrei W, Taube SE, Gion M, Clark GM;
Statistics Subcommittee of NCI-EORTC Working Group on Cancer Diagnos-
tics. REporting recommendations for tumor MARKer prognostic studies
(REMARK). Breast Cancer Res Treat. 2006;100:229-35. [PMID: 16932852]
32. National Institute for Health and Clinical Excellence. Appendix J: method-
ology checklist: prognostic studies. In: The Guidelines Manual, London: Na-
tional Institute for Health and Clinical Excellence; 2009:218-22.
33. Shamliyan T, Kane RL, Jansen S. Systematic reviews synthesized evidence
without consistent quality assessment of primary studies examining epidemiology
of chronic diseases. J Clin Epidemiol. 2012;65:610-8. [PMID: 22424987]
34. Sanderson S, Tatt ID, Higgins JP. Tools for assessing quality and suscepti-
bility to bias in observational studies in epidemiology: a systematic review and
annotated bibliography. Int J Epidemiol. 2007;36:666-76. [PMID: 17470488]
35. Hartling L, Hamm M, Milne A, Vandermeer B, Santaguida PL, Ansari M,
Tsertsvadze A, et al. Validity and Inter-Rater Reliability Testing of Quality As-
sessment Instruments. Rockville, MD: Agency for Healthcare Research and
Quality; 2012. Accessed at www.ncbi.nlm.nih.gov/books/NBK92293/ on 20
November 2012.
Current Author Addresses: Dr. Hayden: Department of Community
Health & Epidemiology, Dalhousie University, 5790 University Avenue,
Room 222, Halifax, Nova Scotia B3H 1V7, Canada.
Dr. van der Windt: Arthritis Research UK Primary Care Centre, Primary
Care Sciences, Keele University, Staffordshire ST5 5BG, United
Kingdom.
Ms. Cartwright: Department of Community Health & Epidemiology,
Dalhousie University, 5790 University Avenue, Room 228, Halifax,
Nova Scotia B3H 1V7, Canada.
Dr. Côté: Faculty of Health Sciences, University of Ontario Institute of
Technology, 2000 Simcoe Street North, Oshawa, Ontario L1H 7K4,
Canada.
Dr. Bombardier: Toronto General Hospital, Eaton North Wing, 6th
Floor, Room 231A, 200 Elizabeth Street, Toronto, Ontario M5G 2C4,
Canada.
Author Contributions: Conception and design: J.A. Hayden, D.A. van
der Windt, P. Côté.
Analysis and interpretation of the data: J.A. Hayden, D.A. van der
Windt, J.L. Cartwright, P. Côté, C. Bombardier.
Drafting of the article: J.A. Hayden, J.L. Cartwright, P. Côté.
Critical revision of the article for important intellectual content: J.A.
Hayden, D.A. van der Windt, J.L. Cartwright, P. Côté, C. Bombardier.
Final approval of the article: J.A. Hayden, D.A. van der Windt, J.L.
Cartwright, P. Côté, C. Bombardier.
Collection and assembly of data: J.A. Hayden, D.A. van der Windt, J.L.
Cartwright.
Appendix Table 1. Description of Experience of Review Teams Conducting Risk of Bias Assessment by Using the QUIPS Tool*

Columns: Characteristic of Critical Appraisal; Review Teams, n

Number of reviewers involved in conducting the critical appraisal
    1: 3
    2: 31
    3 or 4: 7
Process used for critical appraisal
    Single reviewer: 2
    Single reviewer with checking by a second reviewer: 3
    Independent evaluation by 2 reviewers with consensus: 33
    Independent evaluation by 2 reviewers with consensus: 3
    Other: 1
Ease of reaching consensus on assessments
    Very easy: 6
    Easy: 22
    Neutral: 7
    Hard: 3
Time to complete critical appraisal of each study
    Median time (range): 20 (5–90) min
    10 min: 36
    20 min: 23
    1 h: 5
Training or education to complete the critical appraisal
    No: 24
    Yes: 17

QUIPS = Quality In Prognosis Studies.
* Total number of review teams is 43.
Where multiple choices were possible or questions have been skipped without providing an answer, the number of review teams may not always sum to 43.
Appendix Table 2. Description of How the QUIPS Tool Was Used by Review Teams*

Columns: Question; Review Teams, n

Number of QUIPS potential bias domains assessed
    All 6: 29
    5: 9
    4: 4
QUIPS bias domains assessed
    Study participation: 42
    Study attrition: 37
    Prognostic factor measurement: 41
    Outcome measurement: 41
    Study confounding: 35
    Statistical analysis and reporting: 39
How prompting items were used
    All prompting items were scored: 24
    Prompting items were used to guide judgments only: 15
    Other: 1
How ratings of risk of bias for each domain were determined
    Count of items satisfied/not satisfied or algorithm to combine items: 15
    Overall judgment: 16
    Other: 3
How the overall risk of bias of each study was rated
    Count or score of individual prompting items: 13
    Count or score of risk of bias domain assessments: 7
    Overall judgment: 13
    Overall quality of each study was not assessed: 9
Presentation of critical appraisal results for studies included in review
    Reported ratings for individual items for each included study: 20
    Reported each risk of bias domain assessment for each included study: 9
    Reported an assessment of quality for each included study: 20
    Reported an overall assessment of quality across all included studies: 12
    No presentation of critical appraisal results for included studies: 2
Use of the results of the critical appraisal in synthesizing review evidence
    Described the results of the quality assessment for all studies: 24
    Used the quality assessment items or score as inclusion/exclusion criteria: 3
    Used a quality score to define the level of study quality or to rank studies: 19
    Tested the association of potential biases and study results‡: 5
    Not used in synthesis: 6

QUIPS = Quality In Prognosis Studies.
* Total number of review teams is 43.
Where multiple choices were possible or questions have been skipped without providing an answer, the number of review teams may not always sum to 43.
‡ Using subgroup or metaregression analyses.
... Data extraction of numeric values was conducted independently by two investigators, and the source document was checked by a third reviewer for any discrepancies. Risk of bias assessment of individual studies was performed using the quality in prognostic studies (QUIPS) tool (Supporting Information S1: Appendix C) [23]. ...
... Risk of bias was assessed using the QUIPS tool [23]. Most of the included studies had a high or moderate risk of bias due to lack of reporting, specifically in the "Study Attrition" and "Statistical Analysis and Reporting" domains. ...
Article
Full-text available
Objectives The objective of this systematic literature review (SLR) combined with expert clinical review was to identify and rank prognostic factors and effect measure modifiers (EMMs) systematically and comprehensively in patients with relapsed or refractory (R/R) diffuse large B‐cell lymphoma (DLBCL) who initiate treatment after ≥ 2 prior lines of therapy (LoTs; 3L+ R/R DLBCL). Methods We performed an SLR of studies published between 2016 and 2021 and extracted study characteristics, prognostic factors, and EMMs. This was followed by clinical review and ranking of findings by subject matter experts using questionnaires, follow‐up interviews, and quantitative ranking. Results Across 46 included studies, the SLR identified 36 prognostic factors significantly associated with ≥ 1 clinical outcome. Based on subject matter expert ranking of the SLR‐derived list, the five most important prognostic variables in descending order are: early chemo‐immunotherapy failure, Eastern Cooperative Oncology Group performance status, refractory to last LoT, number of prior LoTs, and double‐ or triple‐hit lymphoma. Conclusions This SLR and expert clinical review is the first to provide a comprehensive assessment of prognostic factors for 3L+ R/R DLBCL. No statistically significant EMMs were identified. This robust multi‐method approach can assist in selecting prognostic variables for comparative analyses between real‐world studies and clinical trials.
... Studies were judged on their methodological quality independently by both reviewers (BP and ASP) using the Quality In Prognosis Studies (QUIPS) tool [14,38]. The QUIPS tool is a validated instrument for assessing risk of bias in prognostic factor studies, with reported moderate to substantial inter-rater reliability [38]. ...
... Studies were judged on their methodological quality independently by both reviewers (BP and ASP) using the Quality In Prognosis Studies (QUIPS) tool [14,38]. The QUIPS tool is a validated instrument for assessing risk of bias in prognostic factor studies, with reported moderate to substantial inter-rater reliability [38]. Risk of bias was assessed in six domains: study participation (selection bias, D1), bias due to attrition (D2), prognostic factor (D3) or outcome measurement (D4), adjustment for confounding variables (D5) and clarity of the statistical analysis and reporting (D6). ...
Article
Full-text available
Objective Systematically review and critically appraise the literature on the association between peripartum fetal Doppler sonography findings, i.e., acquired upon admission for spontaneous or induced labor, and perinatal outcome in term (37-42w) pregnancies. Methods Medline, Embase, Web of Science, Cochrane Library, and clinicaltrials.gov databases were systematically searched from inception to 05/2024. Studies conducted in unselected populations of term (37-42w) pregnancies, admitted for spontaneous or induced labor, reporting the association between fetal Doppler findings and perinatal outcome, were eligible for inclusion. Study eligibility was assessed independently by two reviewers. Methodological quality was assessed using the Quality In Prognosis Studies (QUIPS)-tool. Effect estimates were pooled using random-effects meta-analyses. Summary Odds Ratios (ORs) and Mean Differences (MDs) are reported with 95% confidence intervals. Results Thirty-seven studies, reporting on 11.505 women and neonates, were included. Fourteen studies reported on findings from the umbilical artery (UA), four on the middle cerebral artery (MCA), five on the umbilical vein (UV), and nine on the cerebroplacental ratio (CPR). An abnormal UA Doppler and CPR increased the odds of fetal distress (FD) during labor (UA: OR 3.67 [1.14, 11.78], I² = 72% – CPR: OR 3.19 [2.68, 3.80], I² = 0%) and subsequent operative delivery (ODFD) (UA: OR 3.65 [1.66, 8.04], I² = 81% – CPR: OR 2.48 [1.66, 3.70], I² = 57%). Likewise, the presence of UV pulsations was strongly associated with both outcomes (FD: OR 28.78 [11.21, 73.87], I² = 0% – ODFD: OR 303.36 [11.11, 8279.82], I² = 0%). Regarding neonatal outcome, an Apgar-score < 7 at 5 min and NICU admission occurred more frequently if Doppler findings were abnormal in the UA (Apgar: OR 3.65 [1.82, 7.34], I² = 0% – NICU: OR 3.92 [2.36, 6.51], I² = 0%), or in case of an abnormal CPR (Apgar: OR 3.64 [2.03, 6.54], I² = 0% – NICU: OR 2.71 [1.15, 6.38], I² = 0%). 
Neonatal birthweight was also lower in the presence of an abnormal UA or CPR result, with a MD of -630.61g ([-1234.29, -26.93], I² = 80%) and -146.52g ([-285.03, -8.01], I² = 0%) respectively. Most studies (70.3%) were at high risk of bias on one or more domains; only 11 studies had an overall low risk of bias score. Conclusion Doppler sonography in the peripartum period allows for the identification of fetuses at risk of adverse birth outcomes. Further research on optimal thresholds to define at-risk cases and subsequent management strategies is needed. PROSPERO registration number CRD42023413264.
... To further evaluate the risk of bias in prognostic factor studies, the Quality In Prognosis Studies (QUIPS) tool was employed and its results are displayed next to the forest plots [18]. QUIPS examines six domains: (a) Study Participation, (b) Study Attrition, (c) Prognostic Factor Measurement, (d) Outcome Measurement, (e) Study Confounding and (f) Statistical Analysis and Reporting. ...
Article
Full-text available
Objective: Premature placental calcification (PPC) has been implicated in adverse perinatal outcomes, yet its clinical significance remains controversial. This meta-analysis aimed to quantitatively synthesize current data on the association between PPC, defined as grade 3 placental calcification before 36+6 weeks of gestation and adverse perinatal outcomes. Data Sources: A systematic search was conducted in MEDLINE, Scopus and The Cochrane Library from inception until 11 March 2025, to identify eligible studies. Study Eligibility Criteria: Observational studies including singleton pregnancies with PPC diagnosed via ultrasonography between 28+0 and 36+6 weeks of gestation and comparing them with pregnancies with Grannum grade 0, 1, or 2 placentas were considered eligible. Methods: Study quality was assessed using the Newcastle−Ottawa Scale, and the risk of bias was evaluated with the Quality In Prognosis Studies tool. The primary outcomes were small-for-gestational-age (SGA) neonates and preeclampsia. Heterogeneity was assessed using Cochran’s Q test and the I2 statistic. Meta-analyses were conducted using a random-effects model, with outcomes reported as relative risk (RR) or mean difference (MD) with 95% confidence intervals (CIs). Results: In total, nine cohort studies were included. PPC was associated with an increased risk of SGA (RR, 1.99; 95% CI, 1.46−2.70), preeclampsia (RR, 5.27; 95% CI, 2.24−12.40), fetal growth restriction (RR, 2.31; 95% CI, 1.30−4.09), preterm delivery (RR, 2.11; 95% CI, 1.00−4.45), suspected fetal hypoxia (RR, 1.71; 95% CI, 1.13–2.56), low 5 min Apgar score (RR, 2.28; 95% CI, 1.50−3.44) and neonatal intensive care unit admission (RR, 1.80; 95% CI, 1.02−3.18). 
No significant associations were found with fetal or neonatal death (RR, 2.75; 95% CI, 0.87−8.71), cesarean delivery (RR, 1.26; 95% CI, 0.90−1.78), gestational diabetes mellitus (RR, 1.17; 95% CI, 0.81−1.70), neonatal resuscitation (RR, 1.04; 95% CI, 0.92−1.16), birthweight (MD, −187.46 g; 95% CI, −413.14 to +38.21), or gestational age at birth (MD, −0.62 weeks; 95% CI, −1.36 to +0.11). A sensitivity analysis excluding high-risk-of-bias studies yielded consistent results. Conclusions: PPC is associated with several adverse perinatal outcomes, including SGA and preeclampsia. While the clinical significance of placental grading has remained limited in recent years, this study has shown that PPC may serve as an early indicator of placental insufficiency, warranting enhanced fetal surveillance and risk assessment in affected pregnancies. Further research is needed to refine its prognostic utility and integration into obstetric practice.
... Patients were prospectively and systematically enrolled among those addressing intensive inpatient post-stroke rehabilitation and were consistently provided a previously defined evidence-based rehabilitation pathway throughout the involved IRUs. Indeed, the prospective databases used both RIPS and STRATEGY to fulfill the criteria of high quality in prognostic studies 57 , for all areas defining the risk of bias: participation, attrition, prognostic factor measurement, confounding measurement and account, outcome measurement, and analysis and reporting. Moreover, the number of patients included in our analyses is indeed a considerable effort towards a fully representative selection (85%, Fig. 2). ...
Article
Full-text available
An accurate and reliable functional prognosis is vital to stroke patients addressing rehabilitation, to their families, and healthcare providers. This study aimed at developing and validating externally patient-wise prognostic models of the global functional outcome at discharge from intensive inpatient post-acute rehabilitation after stroke, based on a standardized comprehensive multidimensional assessment performed at admission to rehabilitation. Patients addressing intensive inpatient rehabilitation pathways within 30 days from stroke were prospectively enrolled in two consecutive multisite studies. Demographics, description of the event, clinical/functional, and psycho-social data were collected. The outcome of interest was disability in basic daily living activities at discharge, measured by the modified Barthel Index (mBI). Machine learning-based prognostic models were developed, internally cross-validated, and externally validated. Interpretability techniques were applied for the analysis of predictors. 385 patients were considered, 220 (165) for training (external test) sets. A 50.9% (55.8%) of women, 79.5% (80.0%) of ischemic, and a median [interquartile range- IQR] age of 80.0[15.0] (79.0[17.0]) were registered. The Support Vector Machine obtained the best validation performances and a median absolute error [IQR] on discharge mBI estimation of 11.5[15.0] and 9.2[13.0] points on the internal and external testing, respectively. The baseline variables providing the main contributions to the predictions were mBI, motor upper-limb score, age, and cognitive screening score. We achieved a solution to support the formulation of a functional prognosis at intensive rehabilitation admission. The interpretability analysis confirms the relevance of easily collected motor and cognitive dataat admission and of the patient’s age. Trial registration: Prospectively registered on ClinicalTrials.gov (registration numbers RIPS NCT03866057, STRATEGY NCT05389878).
Article
Background Risk factors for venous thromboembolism (VTE) and their relative magnitudes across different phases of care in inflammatory bowel disease (IBD) are poorly understood. Therefore, we performed a systematic review to identify risk factors for VTE in patients with IBD during the hospitalized, post-operative, post-discharge, and ambulatory phases of care. Methods MEDLINE, EMBASE, and Cochrane CENTRAL were systematically searched from inception through to April 2024 without language restriction. We included studies that reported risk factors for VTE among adults with IBD. Summary estimates with 95% confidence intervals (CIs) were calculated for individual risk factors overall and stratified by phase of care using random effects models. Results A total of 123 studies with over 23 510 969 patients were analyzed. We identified 48 variables for meta-analysis overall and 27 were significantly associated with VTE. The strongest risk factors were prior VTE (odds ratio [OR], 4.44; 95% CI, 2.63-7.49), surgical complications (OR, 3.06; 95% CI, 2.48-3.77), urgent surgery (OR, 2.33; 95% CI, 1.62-3.35), blood transfusions (OR, 2.68; 95% CI, 1.17-6.12), hypoalbuminemia (OR, 2.25; 95% CI, 1.93-2.62), and total parenteral nutrition (OR, 2.21; 95% CI, 1.85-2.64). Corticosteroids (OR, 1.60; 95% CI, 1.46-1.76) but not anti-tumor necrosis factor therapy (OR, 0.66; 95% CI, 0.46-0.97) were associated with an increased risk of VTE. No major differences were observed for most variables between hospitalized, post-operative, and post-discharge settings. Conclusions We identified multiple risk factors associated with VTE across different phases of care. This work will help in the development of future predictive models to guide thromboprophylaxis in IBD.
Article
Introduction Pre-eclampsia (PE) remains a major contributor to maternal morbidity and mortality globally. Early identification of risk factors and evaluation of prognostic models for severe adverse maternal outcomes are essential for improving management and reducing complications. While numerous studies have explored potential risk markers, there is still no consensus on the most reliable factors and models to use in clinical practice. This systematic review aims to consolidate research on both individual predictors and prognostic models of severe adverse maternal outcomes in PE, providing a comprehensive overview to support better clinical decision-making and patient care. Methods and analysis This review follows the Meta-analyses Of Observational Studies in Epidemiology (MOOSE) guidelines and the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) Protocol 2015 checklist. A systematic search will be performed using a detailed strategy across Medline, Embase, Cochrane, ProQuest dissertations, and grey literature from inception to 2 April 2024. Eligible studies will include those investigating clinical, laboratory-based, and sociodemographic predictors of severe adverse maternal outcomes in PE. Two reviewers will independently assess titles, abstracts, full texts, and extract data and assess study quality using the Quality In Prognostic Studies (QUIPS) tool for studies on risk predictors and the Prediction model Risk of Bias Assessment Tool (PROBAST) for prognostic models. The inclusion criteria will encompass cohort, case-control, and cross-sectional studies published in English and French involving women diagnosed with PE and reporting on the risk prediction for adverse maternal outcomes. The main outcomes of interest will include severe maternal morbidity and mortality during pregnancy, delivery, or within the postpartum period. 
Analyses will include both narrative synthesis and, where appropriate, meta-analysis using random-effects models. Pooled estimates will be calculated, with publication bias assessed through funnel plots and statistical tests (eg, Begg’s and Egger’s). Heterogeneity will be primarily assessed through visual inspection of forest plots, supported by statistical measures, such as the I² test, with further exploration through sensitivity, subgroup, and meta-regression analyses. Ethics and dissemination This systematic review will be based on published data and will not require ethics approval. Results will be disseminated through peer-reviewed publications and presentations at academic conferences. PROSPERO registration number CRD42024517097.
Article
Full-text available
Flaws in the design, conduct, analysis, and reporting of randomised trials can cause the effect of an intervention to be underestimated or overestimated. The Cochrane Collaboration’s tool for assessing risk of bias aims to make the process clearer and more accurate
Article
Full-text available
In 2003, the QUADAS tool for systematic reviews of diagnostic accuracy studies was developed. Experience, anecdotal reports, and feedback suggested areas for improvement; therefore, QUADAS-2 was developed. This tool comprises 4 domains: patient selection, index test, reference standard, and flow and timing. Each domain is assessed in terms of risk of bias, and the first 3 domains are also assessed in terms of concerns regarding applicability. Signalling questions are included to help judge risk of bias. The QUADAS-2 tool is applied in 4 phases: summarize the review question, tailor the tool and produce review-specific guidance, construct a flow diagram for the primary study, and judge bias and applicability. This tool will allow for more transparent rating of bias and applicability of primary diagnostic accuracy studies.
Article
Full-text available
Parents of febrile children frequently contact primary care. Longer duration of fever has been related to increased risk for serious bacterial infections (SBI). However, the evidence for this association remains controversial. We assessed the predictive value of duration of fever for SBI. Studies from MEDLINE, Embase and Cochrane databases (from January 1991 to December 2009) were retrieved. We included studies describing children aged 2 months to 6 years in countries with high Haemophilus influenzae type b vaccination coverage. Duration of fever had to be studied as a predictor for serious bacterial infections. Seven studies assessed the association between duration of fever and serious bacterial infections; three of these found a relationship. The predictive value of duration of fever for identifying serious bacterial infections in children remains inconclusive. None of these seven studies was performed in primary care. Studies evaluating the duration of fever and its predictive value in children in primary care are required.
Article
Clinical practice guidelines on the management of mineral and bone disorders due to chronic kidney disease recommend specific treatment target levels for serum phosphorus, parathyroid hormone, and calcium. To assess the quality of evidence for the association between levels of serum phosphorus, parathyroid hormone, and calcium and risks of death, cardiovascular mortality, and nonfatal cardiovascular events in individuals with chronic kidney disease. The databases of MEDLINE (1948 to December 2010) and EMBASE (1947 to December 2010) were searched without language restriction. Hand searches also were conducted of the reference lists of primary studies, review articles, and clinical guidelines along with full-text review of any citation that appeared relevant. Of 8380 citations identified in the original search, 47 cohort studies (N = 327,644 patients) met the inclusion criteria. The characteristics of study design, participants, exposures, and covariates together with the outcomes of all-cause mortality, cardiovascular mortality, and nonfatal cardiovascular events at different levels of serum phosphorus, parathyroid hormone, and calcium were analyzed within studies. Data were summarized across studies (when possible) using random-effects meta-regression. The risk of death increased 18% for every 1-mg/dL increase in serum phosphorus (relative risk [RR], 1.18 [95% confidence interval {CI}, 1.12-1.25]). There was no significant association between all-cause mortality and serum level of parathyroid hormone (RR per 100-pg/mL increase, 1.01 [95% CI, 1.00-1.02]) or serum level of calcium (RR per 1-mg/dL increase, 1.08 [95% CI, 1.00-1.16]). Data for the association between serum level of phosphorus, parathyroid hormone, and calcium and cardiovascular death were each available in only 1 adequately adjusted cohort study. Lack of adjustment for confounding variables was not a major limitation of the available studies. 
The evidentiary basis for a strong, consistent, and independent association between serum levels of calcium and parathyroid hormone and the risk of death and cardiovascular events in chronic kidney disease is poor. There appears to be an association between higher serum levels of phosphorus and mortality in this population.
Article
Health providers face the problem of trying to make decisions in situations where there is insufficient information and also where there is an overload of (often contradictory) information. Statistical methods such as meta-analysis have been developed to summarise and to resolve inconsistencies in study findings, where information is available in an appropriate form. Consensus methods provide another means of synthesising information, but are liable to use a wider range of information than is common in statistical methods, and where published information is inadequate or non-existent these methods provide a means of harnessing the insights of appropriate experts to enable decisions to be made. Two consensus methods commonly adopted in medical, nursing, and health services research, the Delphi process and the nominal group technique (also known as the expert panel), are described, together with the most appropriate situations for using them; an outline of the process involved in undertaking a study using each method is supplemented by illustrations of the authors' work. Key methodological issues in using the methods are discussed, along with the distinct contribution of consensus methods as aids to decision making, both in clinical practice and in health service development.
Article
To evaluate how systematic reviews assess the quality of primary studies of incidence, prevalence, or risk factors for chronic diseases. We searched several databases, identified 145 systematic reviews, and evaluated methods of quality assessment and quantitative synthesis of evidence by external or internal validity or overall quality of primary studies. Of 145 reviews, 54 (37%) reported a planned quality assessment of primary studies with checklists or scales and 26 (18%) reported evaluation of some selected quality criteria. Thirty-nine percent of reviews judged appropriateness of sampling and proper controls for confounding factors in primary studies. Twelve percent synthesized evidence by overall quality, 17% by design, 42% by criteria of internal validity, and 24% by external validity of primary studies. Masking of quality assessment was conducted on 2.1% of reviews and 4% tested interobserver agreement for quality assessment. Evaluation of internal and external validity of primary studies is uncommon in systematic reviews of studies of incidence, prevalence, or risk factors for chronic diseases. Inconsistent quality assessment practices reflect the absence of uniformly accepted standards and tools to examine the quality of observational nontherapeutic studies.
Article
Commentary. To provide guidance for the design and interpretation of predictive studies of whiplash associated disorders (WAD). Numerous studies have sought to define and explain the clinical course and response to treatment of people with WAD. Design of these studies is often suboptimal, which can lead to biased findings and issues with interpreting the results. Literature review and commentary. Predictive studies can be grouped into four broad categories: studies of symptomatic course, studies that aim to identify factors that predict outcome, studies that aim to isolate variables that are causally responsible for outcome, and studies that aim to identify patients who respond best to particular treatments. Although the specific research question will determine the optimal methods, there are a number of generic features that should be incorporated into the design of such studies. The aim of these features is to minimize bias, generate adequately precise prognostic estimates, and ensure generalizability of the findings. This paper provides a summary of important considerations in the design, conduct, and reporting of prediction studies in the field of whiplash.
Article
To identify, by systematic review, patient characteristics that can be used by health care practitioners to predict the likelihood of knee osteoarthritis (OA) progression. A search was conducted of the electronic databases Medline, EMBase, CINAHL, AMED, and Web of Science in November 2010. Two reviewers screened articles using inclusion/exclusion criteria. Study participants were adults with established knee OA. Outcome measures for disease progression were change in pain or function or deterioration in radiographic features. Included studies identified clinically relevant prognostic factors at baseline and reported a statistical association with outcome. Minimum followup was 1 year. Articles were assessed for bias, and strength of evidence was summarized for potential predictors of progression. Thirty studies were included, of which 26 were of high quality. Age, varus knee alignment, presence of OA in multiple joints, and radiographic features had strong evidence as predictors of knee OA progression. Body mass index was a strong predictor for long-term progression (>3 years). Moderate participation in physical activity was not associated with progression. Numerous variables had limited or conflicting evidence. Relatively few predictive variables have strong supporting evidence; numerous variables have limited or conflicting evidence. All variables with strong evidence can be easily evaluated and utilized in clinical practice. Existing knowledge should be developed in future research, particularly in cases where study numbers are low or findings are limited or conflicting. Standardized measurement of potential predictors and outcome measures is recommended.
Article
To describe the consensus on science pertaining to resuscitation of the pregnant patient. Systematic review. EMBASE, Ovid MEDLINE, Evidence Based Reviews, American Heart Association library and bibliographies of selected articles. The following inclusion criteria were used: pregnancy and cardiac arrest out of hospital, pregnancy and cardiac arrest in hospital, cardiovascular, respiratory, fetal survival, and pharmacology as they relate to cardiac arrest and resuscitation. Non-English papers, case reports and reviews were excluded. Studies were selected through an independent review of titles, abstracts and full articles. Two reviewers independently graded the methodological quality of selected articles. 1305 articles were identified and 5 were selected for further review. There were no randomized trials and overall the quality of the selected studies was good. Two studies examined chest compressions on a manikin in left lateral tilt from the horizontal and concluded that, although compressions are feasible, their forcefulness decreases with increasing degrees of tilt. The third study observed that transthoracic impedance was not altered during pregnancy. One case series and one retrospective cohort study reviewed perimortem cesarean section. Both reports concluded that perimortem cesarean section is rarely done within the recommended time frame of 5 min after the onset of maternal cardiac arrest. Usual defibrillation dosages are likely appropriate in pregnancy. Perimortem cesarean section is an intervention which is rarely done within 5 min to optimize maternal salvage from cardiac arrest. Chest compressions in left lateral tilt are less forceful compared to the supine position.