Original article
Comparison of the psychometric properties of
health-related quality of life measures used in
adults with systemic lupus erythematosus: a review
of the literature
Madhura Castelino¹, Janice Abbott², Kathleen McElhone¹ and Lee-Suan Teh¹
Abstract
Objective. A review of the literature was undertaken to evaluate the development and psychometric
properties of health-related quality of life (HRQoL) measures used in adults with SLE. This information
will help clinicians make an informed choice about the measures most appropriate for research and
clinical practice.
Methods. Using the key words lupus and quality of life, full original papers in English were identified from
six databases: OVID MEDLINE, EMBASE, Allied and Complementary Medicine, Psychinfo, Web of Science
and Health and Psychosocial Instruments. Only studies describing the validation of HRQoL measures in
adult SLE patients were retrieved.
Results. Thirteen papers were relevant; five evaluated generic instruments [QOLS-S (n = 1), EQ-5D/SF-6D
(n = 1), SF-36 (n = 3)] and eight evaluated disease-specific measures [L-QOL (n = 1), LupusQoL (UK) (n = 1),
LupusQoL (US) (n = 1), SSC (n = 2), SLEQOL (n = 3)]. For the generic measures, there is moderate evidence
of good content validity and internal consistency, whereas there is strong evidence for both these
psychometric properties in disease-specific measures. There is limited to moderate evidence to support the
construct validity and test–retest reliability for the disease-specific measures. Responsiveness and
floor/ceiling effects have not been adequately investigated in any of the measures.
Conclusions. Direct comparison of the psychometric properties was difficult because of the different
methodologies employed in the development and evaluation of the different HRQoL measures.
However, there is supportive evidence that multidimensional disease-specific measures are the most
suitable in terms of content and internal reliability for use in studies of adult patients with SLE.
Key words: quality of life, development, validation, systemic lupus erythematosus.
¹Rheumatology Department, Royal Blackburn Hospital, Blackburn and
²School of Psychology, University of Central Lancashire, Preston, UK.
Correspondence to: Lee-Suan Teh, Department of Rheumatology,
Administration Block, Level 1, Royal Blackburn Hospital, Haslingden
Road, Blackburn BB2 3HH, UK. E-mail: lsteh@btinternet.com
Submitted 15 May 2012; revised version accepted 29 October 2012.
© The Author 2012. Published by Oxford University Press on behalf of the British Society for Rheumatology. All rights reserved. For Permissions, please email: journals.permissions@oup.com
Rheumatology 2013;52:684–696 (Clinical Science)
doi:10.1093/rheumatology/kes370
Advance Access publication 22 December 2012
Downloaded from http://rheumatology.oxfordjournals.org/ at centlancs1 on April 24, 2014
Introduction
SLE is a chronic inflammatory autoimmune disorder
with variable multi-system involvement that affects
primarily young women. The varied manifestations, the
unpredictable relapsing–remitting course of the disease,
side effects of potentially toxic treatments and poor
understanding of the condition by the general public all
have an impact on patients, leading to dissatisfaction
in various domains of their life [1]. Improvement in survival
[2] has not reflected a similar improvement in the quality
of life [3] for SLE patients. As this condition affects a
relatively younger age group, with subsequently longer
disease duration, the clinical manifestations may have
far-reaching psychological and social consequences [4].
Objective assessments of disease activity and damage
are gauged by the clinician and do not capture the
patient’s perspective of their health [3]. Therefore, more
recently it has been emphasized that patient-reported
instruments such as those measuring health-related quality
of life (HRQoL) should be one of the outcome measures in
clinical trials [5]. This has been advocated by both the US
Food and Drug Administration (FDA) [6] and the European
Medicines Agency (EMA) [7].
HRQoL measures, in the form of questionnaires, have
either been developed exclusively for use in SLE
(disease-specific measures) or been developed to evaluate
quality of life in any disease state or in healthy individuals
and then used in SLE patients (generic measures).
An instrument with good psychometric properties will
be able to determine the HRQoL of the patient more
accurately than one without. Knowledge of acceptable
psychometric standards, the conceptual framework and
appraisal of the developmental process of HRQoL tools
would help determine the adequacy of the measure for
clinical use, for example, to determine the effectiveness
of an intervention.
The aim of this work was to compare the psychometric
properties of all published HRQoL measures (generic
and disease specific) that have been developed and/or
evaluated for use in adults with SLE. This should provide
valuable information to clinicians on the appropriateness
of the instrument for measuring HRQoL in their clinical
practice as well as in research studies.
Materials and methods
Search strategy
A literature search was carried out using the keywords
lupus and quality of life. The search was limited to full
papers in the English language and those pertaining
to adult patients. The following databases were searched
up to November 2010: OVID MEDLINE (from 1950),
EMBASE (from 1980), Health and Psychosocial
Instruments (from 1985), Allied and Complementary
Medicine (from 1985), Psychinfo, PubMed and Web of
Science.
The papers were assessed by all the authors based on
the eligibility criteria as defined below:
Inclusion criteria were: papers that described the
methodology of the development and validation of HRQoL
measures in SLE; papers on the linguistic translation and
evaluation of an existing HRQoL measure; and papers that
primarily evaluated the measures in SLE patients.
Exclusion criteria were inadequate numbers of SLE
patients recruited for the evaluation (Fig. 1) and HRQoL
measures published only as abstracts.
Papers that fulfilled any one of the inclusion criteria
in the absence of the exclusion criteria were included in
this review.
After accounting for duplicates, a total of 374 papers
were identified, and after reviewing titles and abstracts,
20 full papers were identified as possibly suitable for this
systematic review (Fig. 1). References of these papers
were also screened for additional relevant papers and
no further papers were identified. All 20 papers were
read by all the authors independently, and using the
eligibility criteria, 13 were identified as suitable for inclusion
in this systematic review. The reasons for exclusion of the
seven papers [8–14] are explained in Fig. 1.
Data extraction and quality assessments
Data extraction was carried out independently by all the
authors. Demographic and clinical data, and information
on the description of the instruments and their
psychometric properties were extracted. The demographic and
clinical data included were age, gender, disease duration,
disease activity and damage. Descriptive information of
the scales included the number of items, domains,
ranges of score, mode of administration, time to
administer and recall period. The psychometric properties
extracted were validity: content, construct (convergent
and divergent), concurrent (criterion) and cross-cultural;
reliability: internal and test–retest; responsiveness and
the floor/ceiling effects. A template was used to assess
the psychometric properties of the instrument and the
quality of the methodology used to determine that
property. The template was based on both the numerical
criteria for the qualitative evaluation of an HRQoL
instrument proposed by Terwee et al. [15] and the consensus
published as the original Consensus Based Standards for
the Selection of Health Measurement Instruments
(COSMIN) checklist [16–19].
Scoring criteria for evaluation of psychometric
properties
The strength of the evidence was rated both for a
psychometric property and the robustness of the methodology
used to determine that property as follows: (i) strong
evidence to support the property and the robustness of the
methodology used to evaluate it was schematically rated
as three pluses, +++; (ii) moderate evidence was denoted
by two pluses, ++; (iii) limited evidence, one plus, +; (iv) no
evidence, a minus, −; (v) if interpretation was difficult, a
question mark, ?; or (vi) not assessed by NA. Any
discrepancies in the scores for the measurement properties
were discussed by all the authors and agreement was
achieved based on the available information.
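For readers who wish to tabulate such ratings electronically, the six-level scheme above can be encoded as a small lookup. The following is only an illustrative sketch: the names `Evidence` and `parse_rating` are hypothetical and do not come from the review.

```python
from enum import Enum


class Evidence(Enum):
    """Rating symbols of the scoring scheme; member names are illustrative."""
    STRONG = "+++"
    MODERATE = "++"
    LIMITED = "+"
    NONE = "-"
    UNCLEAR = "?"
    NOT_ASSESSED = "NA"


def parse_rating(symbol: str) -> Evidence:
    # Accept both the typographic minus (U+2212) and the ASCII hyphen.
    normalized = symbol.strip().replace("\u2212", "-")
    return Evidence(normalized)  # raises ValueError for unknown symbols
```

Normalizing the minus sign before the lookup is a design choice made here so that ratings copied from typeset tables and ratings typed by hand map to the same level.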
Psychometric properties
Psychometric properties of a measurement instrument are
broadly classified into three domains: validity, reliability
and responsiveness. The validity of the instrument
includes evaluation of content, construct (convergent/divergent)
and criterion validity. Content validity ensures that
the measure is sensible, relevant and comprehensively
covers all aspects of the condition assessed. In this
study, content validity was rated positive if there was
involvement of experts (doctors, nurses and social scientists)
as well as patients at the stage of questionnaire
development. Construct validity evaluates the robustness
of the structure and determines the subscales of the
questionnaire. Convergent validity was judged to be
adequately demonstrated if there were high positive
correlations between scales, and divergent validity if
correlations were low or negative. Assessment of
the instrument against the true value or against a gold
standard is termed concurrent (criterion) validity. A positive
rating was given if convincing arguments were presented
that the comparator questionnaire really was the
gold standard and the correlation was ≥0.7. Reliability
assesses the reproducibility and consistency of an instrument.
Internal reliability or internal consistency measures
the extent to which items within a subscale are conceptually
related and the acceptable statistical value is
Cronbach’s α ≥0.7 [15]. Test–retest reliability measures
the stability of a questionnaire and is gauged by the
intraclass correlation coefficient (ICC). An ICC >0.7 is
considered adequate [15]. Responsiveness is the ability
of the scale to detect changes within the same patient
in longitudinal studies. The instrument was considered
to have floor or ceiling effects if >15% of respondents
scored at the extreme ends of the scale. The
generalizability of the questionnaire was assessed by
establishing if the study population was adequately
described to help clinicians extrapolate the results to
their respective patient cohorts.
FIG. 1 Flow chart of the process for review of the literature.
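Two of the criteria above lend themselves to a short worked example: Cronbach's α for internal consistency and the 15% rule for floor/ceiling effects. The sketch below uses the standard textbook formula for α with sample variances; the function names and the toy data layout (one row of item scores per respondent) are assumptions made for illustration, not the procedure of any reviewed study.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for one subscale.

    responses: list of rows, one per respondent, each holding k item scores.
    """
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    item_columns = list(zip(*responses))  # scores grouped by item
    sum_item_var = sum(variance(col) for col in item_columns)
    total_var = variance([sum(row) for row in responses])  # subscale totals
    return (k / (k - 1)) * (1 - sum_item_var / total_var)


def floor_ceiling(scores, minimum, maximum, threshold=0.15):
    """Share of respondents at the scale extremes, flagged against the 15% rule."""
    n = len(scores)
    floor = sum(s == minimum for s in scores) / n
    ceiling = sum(s == maximum for s in scores) / n
    return {
        "floor": floor,
        "ceiling": ceiling,
        "floor_effect": floor > threshold,
        "ceiling_effect": ceiling > threshold,
    }
```

Under the criteria stated above, a subscale would be regarded as internally consistent when `cronbach_alpha(...)` is at least 0.7, and as showing a floor or ceiling effect when the corresponding flag is true.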
Results
Measures included in the review
The measures reviewed can be subdivided into generic or
disease-specific HRQoL measures. The generic measures
used in SLE were the Medical Outcome Survey Short
Form-36 (MOS SF-36 version 1) [20–22], Quality of Life
Scale, Swedish version (QOLS-S) [23], Short Form-6D
(SF-6D) [24] and EuroQoL-5D (EQ-5D) [24]. These four
measures were evaluated in five studies, one for QOLS-S,
three on the SF-36 and one study utilizing both the
SF-6D and the EuroQoL-5D. The disease-specific measures
used in SLE were the Systemic Lupus Erythematosus
Symptom Checklist (SSC) [25, 26], Systemic Lupus
Erythematosus-Specific Quality of Life instrument
(SLEQOL) [27–29], LupusQoL [30, 31] and L-QoL [32].
The SSC, SLEQOL and LupusQoL have all undergone
cross-cultural evaluation [26, 28, 29, 31].
Description of patients
The demographic data of the patients are tabulated in
Table 1. The gender distribution in the studies reflects
the incidence of disease in both genders. Exceptions
were the QOLS-S and the developmental phases of
both the original LupusQoL [referred to as LupusQoL
(UK)] and L-QoL in that only females participated. For
the instruments developed in Europe, the age distribution,
disease duration and disease activity of the samples were
similar. The studies within the Chinese population
included younger patients with less disease activity and
damage and shorter disease duration. Some authors
[22, 24, 32] did not report the ethnic profile of their
samples.
Description of the questionnaires
This is summarized in supplementary Table S1, available
as supplementary data at Rheumatology Online. All measures
except the SSC and L-QoL have multiple domains.
All the questionnaires were developed in English-speaking
populations except the QOLS-S (Swedish) and the SSC
(Dutch). Cross-cultural validation was undertaken for the
SF-36 (to Chinese), SSC (to Brazilian Portuguese),
SLEQOL (to Brazilian Portuguese) and LupusQoL (UK)
(to US English).
The number of items/response options, scoring range/
interpretation, time for administration and recall time for
each measure are also summarized in supplementary
Table S1, available as supplementary data at
Rheumatology Online. The response options varied in
the different measures with most questionnaires using a
5-point or 7-point Likert scale [33]. For the SF-36, SF-6D,
QOLS-S, EQ-5D and LupusQoL, higher scores reflected
better health, while for SLEQOL, SSC and L-QoL the
reverse was true. The time for administration of the
questionnaires, when stated, was <10 min for all measures.
The measures had varying numbers of items (5–40) and
recall periods (present time to the previous 4 weeks and
up to the previous year for the general health item in the
SF-36) with the most typical recall period being the
previous 4 weeks.
Psychometric properties of the questionnaire
The psychometric properties tabulated in Tables 2 and 3
include validity (content, criterion and construct), reliability
(internal consistency and test–retest reliability) and
responsiveness. These properties are further described/
analysed in the following paragraphs.
Content validity
Qualitative interviews or other ways of involving patients
and experts (rheumatologists and nurse clinicians) were
an essential part of the development of the disease-specific
measures and the QOLS-S. In the case of generic
measures, the content validity was assumed for an SLE
population.
Construct validity
The construct validity was evaluated using different methods
and statistical analyses in the different papers. Factor
analysis carried out for the SF-36 failed to sufficiently support
the proposed domains in the studies [21, 22].
QOLS-S used the consensus model, whereas SF-6D
and EQ-5D had no factor analysis done. In comparison,
the factor analyses for all of the disease-specific measures
except the SLEQOL confirmed the dimensionality
and the domain structure. The developers of SLEQOL
based the final subsections on convenience.
Correlation coefficients were used to determine convergent
and divergent validity to test a priori hypotheses.
Convergent validity was evaluated against different measures
in the various studies. This was not assessed adequately
for the SF-36, SF-6D and EQ-5D. The QOLS-S
was compared against the Arthritis Impact Measurement
Scale, with moderate correlation between the psychological
score and the QOLS-S for SLE patients. All the
disease-specific measures used the SF-36 as a comparator
measure for convergent validity, except the L-QoL,
which used the Nottingham Health Profile (NHP). The
LupusQoL (US) used the EQ-5D in addition to the
SF-36. Moderate to strong correlations were noted for
the disease-specific measures except for the SLEQOL.
The evaluation of divergent validity and discriminant validity
was against the disease activity measures (the BILAG
index or the SLEDAI) and the damage index [SLICC/ACR
Damage Index (SLICC/ACR-DI)] for all studies. Weak or
no correlations supported the divergent validity of both
generic and disease-specific measures. The disease-specific
measures had moderate evidence for construct
validity, but the generic measures had limited evidence
for this.

TABLE 1 Demographic data of the studies included in this systematic review
Measure | First author and year [ref] | Number of patients in the studies | Female, n (%) | Age, mean (S.D.), years | Disease duration, mean (S.D.), years | Disease activity, median (range) | Disease damage, median (range)
SF-36 v1 | Stoll 1997 [20] | n = 150 | 143 (95) | 40 (11.0) | 10 (7) | BILAG 5 (3–7) | SLICC 1 (0–2)
 | Thumboo 1999 [21] | n = 118 | 112 (94.5) | NS | 3.61 (0.01–16.1)ᵃ | BILAG 2 (0–15) | SLICC 0 (0–8)
 | Thumboo 2000 [22] | n = 69 | 61 (88.4) | NS | 4.7 (0.1–25.5)ᵇ | BILAG 3 (0–12) | SLICC 0 (0–6)
QOLS-S | Burckhardt 1992 [23] | n = 50 | 50 (100) | 43.5 (12.7) | 13.9 (10.2) | NS | NS
EQ-5D/SF-6D | Aggarwal 2009 [24] | n = 167 | 156 (93.5) | 42.5 (13.0) | 9.3 (8.8) | SLEDAI 6.2 (5.7)ᶜ | SLICC 2 (2)ᵇ
SSC | Grootscholten 2003 [25] | Group I: n = 87 | 82 (94) | 32.5 (20–71)ᵇ | 8 (1–30)ᵇ | SLEDAI 4 (1–6) | NS
 | | Group II: n = 33 | 29 (88) | 37.0 (18–64)ᵇ | 9.2 (1–26)ᵇ | SLEDAI 4 (1–6) | NS
 | Freire 2007 [26] | n = 50 | 44 (93) | 34.2 (12.0) | 6.5 (7.4) | SLEDAI2K 7.2 (4.7)ᶜ | NS
SLEQOL | Leong 2005 [27] | Initial group: n = 100 | 89 (89) | 39.4 (13.7) | NS | NS | NS
 | | Test group: n = 275 | 248 (90.5) | 40.1 (13.4) | NS | SLEDAI 2.74 (4.82)ᶜ | SLICC 0.67 (0–6)
 | Kong 2007 [28] | n = 237 | 213 (89.9) | 47.63 (11.91) | NS | NS | NS
 | Freire 2010 [29] | n = 107 | 106 (99.5) | 36.8 (12) | 5.9 (5.6) | NS | NS
LupusQoL | McElhone 2007 [30] | Interview: n = 30 | 30 (100) | 48.1 (13.1) | 9.2 (8.4) | NS | NS
 | | Patient feedback on draft: n = 20 | 20 (100) | 52.0 (15.2) | 11.2 (6.1) | NS | NS
 | | Psychometric testing version 1: n = 322 | 299 (93) | 45.1 (13.4) | NS | NS | NS
 | | Psychometric testing version 2: n = 215 | 206 (96) | 46.2 (13.3) | NS | NS | NS
 | | Psychometric testing version 3: n = 160; (postal survey n = 115) | 152 (95) | 45.3 (13.9) | NS | NS | NS
 | Jolly 2010 [31] | n = 205 (complete data n = 186) | NS (94) | 42.5 (12.9) | NS | SLEDAI 4 (0–27) [6.2 (5.8)ᶜ] | SLICC 1 (0–10) [2.0 (2.1)ᶜ]
L-QoL | Doward 2009 [32] | Qualitative interviews: n = 50 | 47 (94.0) | 42.6 (13.4) | 8 (0.3–36)ᵇ | NS | NS
 | | Field test interviews: n = 16 | 16 (100) | 48.7 (11.1) | 9.5 (1–22)ᵇ | NS | NS
 | | Postal survey 1: n = 95 | 90 (94.7) | 45.3 (15.0) | 7 (1–50)ᵇ | NS | NS
 | | Postal survey 2: n = 93 | 91 (97.8) | 43.9 (12.1) | 14 (1–37)ᵇ | NS | NS
ᵃMean (range) as published; ᵇmedian (range); ᶜmean (S.D.). NS: not stated.

TABLE 2 Psychometric qualities of SLE measures

SF-36 (UK)
Content validity: Assumed for all disease states.
Construct validityᵃ: No factor analysis. Eight domains assumed. Hypotheses: known groups validity. Discriminant validity: levels of disease severity (BILAG). Divergent validity: 0.40 to 0.27 with BILAG; 0.30 to 0.04 with SLICC/ACR-DI. Convergent validity with SF-20+: 0.31–0.89.
Concurrent criterion validity: SF-20+ (SF-20 with fatigue) as gold standard. Correlation [0.69 (0.89–0.31)].

SF-36 (Singapore)
Content validity: Assumed for all disease states.
Construct validityᵃ: Factor analysis: eight domains loading onto four factors: (i) Physical functioning; (ii) Physical and emotional role functioning and bodily pain; (iii) Mental health, vitality and social functioning; (iv) General health. Hypotheses: divergent validity: 0.37 to 0.09 with disease activity (BILAG) and 0.25 to 0.16 with damage (SLICC/ACR DI).
Concurrent criterion validity: No gold standard.

Chinese SF-36
Content validity: Assumed for all disease states.
Construct validityᵃ: Factor analysis: four domains (physical functioning, role physical, social functioning, bodily pain) loaded onto one factor; the other four domains loaded onto at least two factors. Hypotheses: known groups validity. Divergent validity: 0.34 to 0.17 with disease activity (BILAG) and 0.35 to 0.19 with damage (SLICC/ACR DI).
Concurrent criterion validity: No gold standard.

QOLS-S
Content validity: Assumed for all chronic diseases [35] + additional item (independence) [1].
Construct validityᵃ: No factor analysis during the original development. Consensus model used for categorizing the items based on original five factors; item to scale correlation (t₁ = 0.21–0.64; t₂ = 0.44–0.70). On recent analysis: three-factor structure [34]. Hypotheses: divergent validity: 0.00 to 0.63 with VAS-pain, Ritchie Articular Index, patient SLAM and some of the AIMS subscales.
Concurrent criterion validity: No gold standard.

SF-6D
Content validity: Assumed.
Construct validityᵃ: No factor analysis. Hypotheses: discriminant validity: SLEDAI used to differentiate subgroups based on disease severity and SLICC/ACR DI for levels of damage. Divergent validity: 0.23 with SLEDAI and 0.22 with SLICC/ACR DI.
Concurrent criterion validity: SF-36 as gold standard. Strong correlations between SF-6D and SF-36 (0.76–0.57) and PCS (0.72) but not MCS (0.30).

EQ-5D
Content validity: Not assessed. Likely to have assumed.
Construct validityᵃ: No factor analysis. Hypotheses: discriminant validity: based on disease severity (SLEDAI) and damage (SLICC/ACR DI); convergent validity: 0.69 to 0.55 with corresponding domains of SF-36. Divergent validity: 0.49 to 0.24 with non-corresponding domains of SF-36; 0.16 to 0.21 with SLEDAI and 0.21 to 0.20 with SLICC/ACR DI.
Concurrent criterion validity: No gold standard.

SSC
Content validity: Derived from Dutch literature; physician and patient involvement to add items.
Construct validityᵃ: Exploratory factor analysis: unidimensional scale. Hypotheses: Correlations: 0.44 to 0.66 with SF-36, 0.26 with SLEDAI and 0.54 to 0.69 with IRGL and POMS.
Concurrent criterion validity: No gold standard.

SLEQOL
Content validity: Nomination by experts; patients ascertained importance and relevance of items and item addition (n = 100).
Construct validityᵃ: Factor analysis: eight factors. Divided into six subsections for convenience. Rasch analysis: to ascertain the item difficulty. Hypotheses: not stated. Convergent validity: 0.030–0.171 with SF-36, RAI Helplessness subscale. Divergent validity: 0.003–0.091 with SLEDAI, SLAM, SLICC/ACR-DI.
Concurrent criterion validity: No gold standard.

LupusQoL (UK)
Content validity: Expert input and patient semi-structured qualitative interviews (n = 30).
Construct validityᵃ: Exploratory principal component factor analysis: eight factors. Hypotheses: not stated. Concurrent validity: 0.71–0.79 with related domains of SF-36. Discriminant validity: BILAG used to differentiate levels of disease severity (seven domains) and SLICC/ACR DI for damage (five domains).
Concurrent criterion validity: No gold standard.

LupusQoL (US)
Content validity: Assumed based on original LupusQoL.
Construct validityᵃ: Factor analysis: exploratory: eight domains loading onto five factors; confirmatory: five-factor loading. Hypotheses: convergent validity: 0.54–0.73 with related domains of SF-36, 0.50 to 0.68 with EQ-5D. Discriminant validity: SLEDAI used to differentiate levels of disease severity (four domains) and SLICC/ACR DI for damage (six domains).
Concurrent criterion validity: LupusQoL (UK) as gold standard.

L-QoL
Content validity: Themes and items identified from qualitative interviews reviewed by patients (n = 50).
Construct validityᵃ: Factor analysis: unidimensional scale. Hypothesis: convergent validity: moderate correlations with NHP and worse scores for poorer health or more severe SLE (values not published).
Concurrent criterion validity: No gold standard.

ᵃAll values are correlation coefficient r interpreted as follows: >0.6: strong positive correlation; less than −0.6: strong negative correlation; 0.30–0.59: moderate positive correlation;
−0.30 to −0.59: moderate negative correlation; <0.30 to 0: weak positive correlation; greater than −0.30 to 0: weak negative correlation. AIMS: Arthritis Impact Measurement Scale;
IRGL: Influence of Rheumatic Diseases on General Health and Lifestyle; MCS: Mental Component Score; PCS: Physical Component Score; POMS: Profile of Mood States; RAI:
Rheumatology Attitudes Index; SLAM: Systemic Lupus Activity Measure.

TABLE 3 Further psychometric properties of HRQoL measures used in SLE

SF-36 (UK)
Internal consistency [mean (range)]: Cronbach’s α (n = 150) [0.85 (0.71–0.95)].
Test–retest reliability [mean (range)]: Not assessed.
Responsiveness: Not assessed.
Floor/ceiling effect: Not assessed.
Generalizability: Good descriptive data of population sample. Missing data included in analysis.

SF-36 (Singapore)
Internal consistency [mean (range)]: Cronbach’s α (n = 118) [0.89 (0.84–0.94)].
Test–retest reliability [mean (range)]: Repeatability coeff. Bland and Altman [38.7 (21.1–69.3)]. Spearman’s rank correlation [0.76 (0.67–0.88)]. (n = 78) (t =514days).
Responsiveness: Not assessed.
Floor/ceiling effect: Floor effect [19]: mean 10.92%, range 0–22.6%. Ceiling effect [19]: mean 24.92%, range 0.4–58.9%.
Generalizability: Good descriptive data of population sample. Handling of missing data not explained.

Chinese SF-36
Internal consistency [mean (range)]: Cronbach’s α (n = 69) [0.83 (0.72–0.91)].
Test–retest reliability [mean (range)]: Repeatability coeff. Bland and Altman [39.19 (21.6–70.29)]. Spearman’s rank correlation [0.81 (0.65–0.90)]. (n = 47) (t =514 days).
Responsiveness: Not assessed.
Floor/ceiling effect: Not assessed.
Generalizability: Good descriptive data of population sample. Missing scale items were substituted.

QOLS-S
Internal consistency [mean (range)]: Cronbach’s α: time 1: 0.85; time 2: 0.91.
Test–retest reliability [mean (range)]: Correlation coeff.: 0.86 (n = not specified) (t = 4 weeks).
Responsiveness: Not assessed.
Floor/ceiling effect: Not assessed.
Generalizability: Good descriptive data of population sample but no data on ethnicity. Handling of missing data explained.

SF-6D
Internal consistency [mean (range)]: Domains derived from SF-36. No factor analysis.
Test–retest reliability [mean (range)]: Not assessed.
Responsiveness: Effect sizes: small, 0.04–0.43 (n = 66). Sensitive to self-reported improvement and improvement in EQ-5D VAS.
Floor/ceiling effect: Ceiling effect: 2.6%. Floor effect: 0.67%.
Generalizability: Good descriptive data of population sample. Handling of missing data not explained.

EQ-5D
Internal consistency [mean (range)]: Not assessed.
Test–retest reliability [mean (range)]: Not assessed.
Responsiveness: Effect sizes: small, 0.012–0.428 (n = 66). Sensitive to self-reported improvement and improvement in EQ-5D VAS.
Floor/ceiling effect: Ceiling effect: 12.7%. Floor effect: none.
Generalizability: Good descriptive data of population sample. Handling of missing data not explained.

SSC
Internal consistency [mean (range)]: Unidimensional (n = 87). Cronbach’s α: SSC: 0.89; TDL: 0.89.
Test–retest reliability [mean (range)]: Pearson’s correlation coefficient (n = 28) (t = 1 month): SSC: 0.78; TDL: 0.87.
Responsiveness: (n = 17) (t = 1 year): change in SSC noted but not in subjective patient VAS.
Floor/ceiling effect: Not assessed.
Generalizability: Good descriptive data of population sample but no data on ethnicity. Missing data stated: ‘almost no missing data’.

SLEQOL
Internal consistency [mean (range)]: Cronbach’s α: overall 0.95 (n = 275); for subsections [0.87 (0.76–0.93)].
Test–retest reliability [mean (range)]: (n = 51) (t = 2 weeks). ICC: summary score: 0.83. ICC for subsections: [0.63 (0.57–0.80)].
Responsiveness: (n = 115 data pairs from 95 patients). More sensitive but less specific than SF-36. Multiple statistical methods used.
Floor/ceiling effect: Floor effect: mean 28.8%, range 14.9–44%. Ceiling effect: mean 0.57%, range 0–2.6%.
Generalizability: Good descriptive data of population sample. Handling of missing data not explained.

LupusQoL (UK)
Internal consistency [mean (range)]: Cronbach’s α (n = 160) [0.93 (0.88–0.96)].
Test–retest reliability [mean (range)]: (n = 83) (t = 4 weeks). ICC [0.83 (0.72–0.93)].
Responsiveness: Not assessed.
Floor/ceiling effect: 10% floor effects: mean 5.76%, range 2.2–10.8%. 10% ceiling effect: mean 17.99%, range 6.2–28.2%.
Generalizability: Good descriptive data of population sample. Missing responses mentioned, treated as unanswered.

LupusQoL (US)
Internal consistency [mean (range)]: Cronbach’s α (n = 185) [0.91 (0.83–0.94)].
Test–retest reliability [mean (range)]: (n = 15) (t = 1 week). ICC [0.87 (0.68–0.92)].
Responsiveness: Not assessed.
Floor/ceiling effect: Not assessed.
Generalizability: Good descriptive data of population sample. Handling of missing data not explained.

L-QoL
Internal consistency [mean (range)]: Unidimensional scale. Cronbach’s α: time 1: 0.91 (n = 93); time 2: 0.92 (n = 76).
Test–retest reliability [mean (range)]: (n = 76) (t = 2 weeks). ICC: 0.95.
Responsiveness: Not assessed.
Floor/ceiling effect: Floor and ceiling effects: reported as relatively few scored at the extremes.
Generalizability: Good descriptive data of population sample but no ethnicity data. Missing data stated as minimal.

n: number of patients involved; t: time interval; TDL: total distress level.
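The interpretation bands for the correlation coefficient r given in the footnote to Table 2 can be expressed as a short function. This is a minimal sketch: the function name is hypothetical, and two boundary choices are assumptions made here because the footnote leaves them open (values of |r| between 0.59 and 0.60 are treated as moderate, exactly 0.60 as strong, and r = 0 is reported as no correlation).

```python
def interpret_r(r):
    """Label a correlation coefficient using the Table 2 footnote bands."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    if r == 0:
        return "no correlation"  # assumption: the footnote lists 0 under both weak bands
    magnitude = abs(r)
    if magnitude >= 0.60:  # assumption: the footnote says >0.6
        strength = "strong"
    elif magnitude >= 0.30:
        strength = "moderate"
    else:
        strength = "weak"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction}"
```

For example, the correlation of 0.72 between the SF-6D and the PCS reported in Table 2 would be labelled strong positive by this sketch.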
Concurrent (criterion) validity
The SF-36, SF-6D and LupusQoL (US) were derived from
the SF-20, SF-36 and LupusQoL (UK), respectively, and
therefore had gold standards for determining concurrent
(criterion) validity. The SF-36 and SF-6D showed strong
correlation coefficients in most domains when compared
with the original instrument. Concurrent validity was not
reported in the LupusQoL (US), and thus this property could
not be adequately explored for this instrument.
Internal consistency
There was moderate to strong evidence for internal
consistency in all the measures except the SF-6D and EQ-5D.
The scoring was evaluated as moderate due to the smaller
sample sizes in the studies for the Chinese SF-36 and the
QOLS-S. For all the scales, the mean internal consistency
(Cronbach’s α) was >0.80. All of the disease-specific
measures had high mean ICC, although the individual
domains ranged from 0.57 to 0.93 across measures.
Testretest reliability
Testretest reliability varied for the different domains in the
various measures. Four of the measures [Chinese SF-36,
SSC, QOLS-S and LupusQoL (US)] used sample sizes
of fewer than 50, which is the minimum number recom-
mended for such analyses [15]. For all the instruments the
mean ICC was >0.70. The English and the Chinese ver-
sions of the SF-36 assessed in the Singapore population
employed the Bland and Altman repeatability coefficient.
Two of the domains were found to have large changes,
thus bringing into question the reliability of these domains.
The SLEQOL had good ICC for the overall score on
testretest reliability, but the scores of the subsections
were below the accepted 0.70 in four of the six
domains. The LupusQoL (UK) had strong evidence for
good methodology and ICC, but only two of the eight do-
mains had a sample size of more than 50 patients.
From the data available it is clear that the generic measures
have limited evidence for test–retest reliability,
whereas the disease-specific measures appear to fare
marginally better with limited to moderate evidence for
this property.
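The Bland and Altman approach mentioned for the Singapore SF-36 studies can be sketched as follows. This is an illustration under stated assumptions rather than the method of those studies: the repeatability coefficient is taken here as 1.96 times the standard deviation of the paired test–retest differences, and the function name `bland_altman` and the scores are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(test, retest, z=1.96):
    """Bias, 95% limits of agreement and repeatability coefficient
    for paired test and retest scores."""
    if len(test) != len(retest) or len(test) < 2:
        raise ValueError("need two equally long series of at least two scores")

    differences = [second - first for first, second in zip(test, retest)]
    bias = mean(differences)   # mean retest-minus-test difference
    sd = stdev(differences)    # sample SD of the differences

    return {
        "bias": bias,
        "limits_of_agreement": (bias - z * sd, bias + z * sd),
        "repeatability_coefficient": z * sd,
    }


# Hypothetical domain scores for four patients on two occasions.
test_scores = [50, 60, 70, 80]
retest_scores = [52, 58, 73, 79]
result = bland_altman(test_scores, retest_scores)
print(result)
```

A large repeatability coefficient relative to the range of the scale means that sizeable score changes can occur in stable patients, which is the concern raised for two of the SF-36 domains.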
Responsiveness and floor/ceiling effect
Percentages of floor and ceiling effect were provided for
the EQ-5D, SF-6D, SLEQOL, SF-36 (Singapore) and
LupusQoL (UK). The EQ-5D and SF-6D did not show
either effect, the SLEQOL reported floor effects and the
SF-36 (Singapore) and LupusQoL (UK) were noted to have
ceiling effects in some of the domains. The responsive-
ness or sensitivity to change was assessed in four of the
HRQoL measures (SLEQOL, SSC, EQ-5D and SF-6D).
Although the SSC scores improved statistically, the patients
(on treatment with cyclophosphamide) did not perceive
any change on the patient visual analogue
scale (VAS). The authors attributed this to psychological
adaptation in patients with chronic illness and to
the small sample size. For the EQ-5D and SF-6D, only
small effect sizes were demonstrated.
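The quantities in this section can be illustrated with a short sketch. It makes several assumptions not stated in this review: floor and ceiling effects are expressed as the percentage of respondents at the lowest and highest possible score, the effect size is the mean change divided by the baseline standard deviation, and the standardized response mean is the mean change divided by the standard deviation of the change. All function names and data are hypothetical.

```python
from statistics import mean, stdev


def floor_ceiling_percentages(scores, min_score, max_score):
    """Percentage of respondents at the lowest and highest possible score."""
    n = len(scores)
    floor_pct = 100 * sum(s == min_score for s in scores) / n
    ceiling_pct = 100 * sum(s == max_score for s in scores) / n
    return floor_pct, ceiling_pct


def effect_size(baseline, follow_up):
    """Mean change divided by the SD of the baseline scores."""
    change = [after - before for before, after in zip(baseline, follow_up)]
    return mean(change) / stdev(baseline)


def standardized_response_mean(baseline, follow_up):
    """Mean change divided by the SD of the change scores."""
    change = [after - before for before, after in zip(baseline, follow_up)]
    return mean(change) / stdev(change)


# Hypothetical domain scores on a 0-100 scale.
domain_scores = [0, 0, 50, 100, 75, 100, 100, 25, 60, 100]
floor_pct, ceiling_pct = floor_ceiling_percentages(domain_scores, 0, 100)
print(f"floor: {floor_pct:.0f}%, ceiling: {ceiling_pct:.0f}%")

# Hypothetical scores before and after an intervention.
before = [40, 50, 60, 70]
after = [45, 52, 66, 75]
print(f"effect size: {effect_size(before, after):.2f}")
print(f"SRM: {standardized_response_mean(before, after):.2f}")
```

By a widely used convention, effect sizes of about 0.2, 0.5 and 0.8 are read as small, moderate and large, and a floor or ceiling effect is often flagged when more than 15% of respondents sit at an extreme. Neither cut-off is specified in this review.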
Table 4 summarizes schematically the level of evidence
for the main psychometric properties of the HRQoL measures
that have been evaluated in this review. There is
strong to moderate evidence for good reliability, internal
consistency and validity for the disease-specific measures.
The internal consistency was strong to moderate
for both generic and disease-specific measures.

TABLE 4 Level of evidence for psychometric properties of HRQoL measures evaluated in patients with SLE

| Instrument | Content validity | Internal consistency | Test–retest reliability | Construct validity | Responsiveness | Floor/ceiling effect |
|---|---|---|---|---|---|---|
| SF-36 (UK) | NA | +++ | NA | + | NA | NA |
| SF-36 (Singapore) | NA | +++ | ? | + | NA | − |
| Chinese SF-36 | NA | ++ | ? | + | NA | NA |
| QOLS-S | ++ | ++ | ? | + | NA | NA |
| SF-6D | NA | NA | NA | + | + | ++ |
| EQ-5D | NA | NA | NA | + | + | ++ |
| SSC | ++ | ++ | + | ++ | − | NA |
| SLEQOL | +++ | +++ | ++ | ++ | + | − |
| LupusQoL (UK) | +++ | +++ | ++ | ++ | NA | − |
| LupusQoL (US) | NA | +++ | ? | ++ | NA | NA |
| L-QoL | +++ | +++ | ++ | ++ | NA | ? |

Levels of evidence: +++: strong evidence for measurement property (excellent evidence of methodological quality); ++: moderate evidence for measurement property (good evidence of methodological quality); +: limited evidence for measurement property (study of fair methodological quality); −: no evidence for measurement property; ?: interpretation difficult (poor methodological quality); NA: not assessed.
Generalizability
The description of the study sample was available for all the
measures except QOLS-S, SSC and L-QoL. This important
omission makes it difficult for the reader to determine
the population for which the measure is best suited.
Discussion
This review highlights some deficiencies in the HRQoL
measures used in SLE patients. Comparison of the stu-
dies was difficult due to the varied methodology employed
by the different authors. This is partly because the studies
were undertaken at different time periods (between 1997
and 2010) and reflected the trend in the development of
quality of life measures over the years. Despite this, all the
HRQoL measures developed/evaluated for use in SLE
patients have moderate to strong evidence for content
validity and internal consistency. The structure of a good
measure should emphasize and evaluate the domains of
importance to the patients. All the disease-specific meas-
ures have addressed this adequately. However, there was
limited evidence for construct validity for the generic
measures.
Only a stable measure can be used with confidence in
clinical studies. Accuracy in determining the changes of
HRQoL in a clinical situation is crucial if the measure is
to evaluate any improvement, deterioration or lack of
change in the patients’ quality of life. Only some of the
disease-specific measures have moderate evidence
for test–retest reliability, and this was not demonstrated
adequately in the generic measures.
In addition, responsiveness and floor and ceiling effects
were not evaluated in all measures. In those measures
where they were evaluated, there was limited evidence
for responsiveness of the SLEQOL, EQ-5D and SF-6D,
while the evidence for the SSC was inconclusive. For a
clinician to be able to benefit from using an HRQoL meas-
ure it is essential that responsiveness is evaluated, so that
data from interventional studies can be interpreted more
accurately.
The settings in which the surveys were administered
(outpatient clinics, inpatient, postal surveys or a mixture
of these) may have introduced bias. Although this makes
comparison between studies difficult, it would reflect the
bias to be expected when the measures are subsequently
administered in similar settings.
Appropriate samples are important in terms of both
composition and size. Sample size calculations and the
reasoning behind the numbers recruited would have
added to the robustness of the studies. Acknowledging
missing data in any study is also important, as it not only
highlights the possible areas of difficulties for the subjects,
but also gives the researchers the opportunity to scrutinize
any deficiencies of the questionnaire. Missing data were
described in all the studies, but how they subsequently
affected the developmental process of each questionnaire
was not described in any of the papers.
The selection of the most appropriate instrument to
use in a study will be determined by the aims of that
study. A unidimensional measure lends itself more to economic
evaluation. On the other hand, a multidimensional
measure addresses the various aspects vital to the concept
of quality of life for an individual. Multidimensional
measures help to identify areas that need to be targeted
for intervention to improve the quality of life for an adult
with SLE.
Guidelines from the FDA and EMA encourage the use
of patient-reported outcome evaluations in studies on
the development of interventions, especially new medica-
tions. The validation of HRQoL instruments is essential
to ascertain that these tools are robust and can thus
be confidently used by clinicians. In this review, it is
evident that the methodologies employed in the process
of HRQoL development have not been uniform
across the measures. The recently published COSMIN
checklist [16–19], developed by international consensus,
will hopefully inform future research and lead to a
more uniform approach that would aid comparison.
However, all the measures that have been discussed
in this review were developed prior to the publication of
this guidance.
In conclusion, based on the published studies reviewed,
the disease-specific multidimensional measures have
the strongest evidence for content and construct validity
as well as internal consistency. More studies would be
required to support the stability of these measures and
their sensitivity/responsiveness. If these properties are
supported, it would then make the disease-specific meas-
ures strong contenders for use in clinical practice and
interventional studies in adult SLE populations.
Rheumatology key messages
• Stronger evidence exists for reliability and validity
of SLE disease-specific HRQoL measures than for
generic measures.
• HRQoL measures used in SLE should be evaluated
for responsiveness to aid clinical interpretation.
Disclosure statement: The authors have declared no
conflicts of interest.
Supplementary data
Supplementary data are available at Rheumatology
Online.
References
1 Archenholtz B, Burckhardt CS, Segesten K. Quality of life
of women with systemic lupus erythematosus or
rheumatoid arthritis: domains of importance and
dissatisfaction. Qual Life Res 1999;8:411–6.
2 Urowitz MB, Gladman DD, Tom BDM, Ibanez D,
Farewell VT. Changing patterns in mortality and disease
outcomes for patients with systemic lupus erythematosus.
J Rheumatol 2008;35:2152–8.
3 McElhone K, Abbott J, Teh L-S. A review of health related
quality of life in systemic lupus erythematosus. Lupus
2006;15:633–4.
4 Seawell AH, Danoff-Burg S. Psychosocial research on
systemic lupus erythematosus: a literature review. Lupus
2004;13:891–9.
5 Strand V, Gladman D, Isenberg D, Petri M, Smolen J,
Tugwell P. Endpoints: consensus recommendations from
OMERACT IV. Lupus 2000;9:322–7.
6 Guidance for industry patient-reported outcome
measures: use in medical product development to support
labeling claims. 2009.
http://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/UCM193282.pdf
(29 November 2011, date last accessed).
7 Reflection paper on the regulatory guidance for the use of
health related quality of life (HRQL) measures in the
evaluation of medicinal products. 2005.
http://www.ispor.org/workpaper/EMEA-HRQL-Guidance.pdf
(29 November 2011, date last accessed).
8 Moore AD, Clare AE, Danoff DS et al. Can health utility
measures be used in lupus research? A comparative
validation and reliability study of 4 utility indices.
J Rheumatol 1999;26:1285–90.
9 Gladman DD, Urowitz MB, Ong A et al. A comparison
of five health status instruments in patients with systemic
lupus erythematosus (SLE). Lupus 1996;5:190–5.
10 Luo N, Chew LH, Fong KY et al. Validity and reliability of
the EQ-5D self-report questionnaire in Chinese-speaking
patients with rheumatic diseases in Singapore. Ann Acad
Med Singapore 2003;32:685–90.
11 Luo N, Chew LH, Fong KY et al. Validity and reliability of
the EQ-5D self-report questionnaire in English-speaking
Asian patients with rheumatic diseases in Singapore. Qual
Life Res 2003;12:87–92.
12 Ariza-Ariza R, Hernandez-Cruz B, Navarro-Sarabia F.
EuroQol is a useful instrument for assessing the
health-related quality of life of the patients with systemic
lupus erythematosus. Lupus 2005;14:334–5.
13 Thumboo J, Fong KY, Ng TP et al. Initial construct
cross-cultural validation of the Short Form 36 for quality of
life assessment of systemic lupus erythematosus patients
in Singapore. Ann Acad Med Singapore 1997;26:282–4.
14 Rood MJ, Borggreve SE, Huizinga TWJ. Sensitivity to
change of the MOS SF-36 quality of life assessment
questionnaire in patients with systemic lupus erythematosus
taking immunosuppressive therapy. J Rheumatol
2000;27:2057–9.
15 Terwee CB, Bot SD, de Boer MR et al. Quality criteria were
proposed for measurement properties of health status
questionnaires. J Clin Epidemiol 2007;60:34–42.
16 Mokkink LB, Terwee CB, Patrick DL et al. The COSMIN
study reached international consensus on taxonomy,
terminology, and definitions of measurement properties for
health-related patient-reported outcomes. J Clin
Epidemiol 2010;63:737–45.
17 Mokkink LB, Terwee CB, Patrick DL et al. The COSMIN
checklist for assessing the methodological quality of
studies on measurement properties of health status
measurement instrument: an international Delphi study.
Qual Life Res 2010;19:539–49.
18 Mokkink LB, Terwee CB, Knol DL et al. The
COSMIN checklist for evaluating the methodological
quality of studies on measurement properties: a
clarification of its content. BMC Med Res Methodol 2010;10:22.
19 Terwee CB, Mokkink LB, Knol DL, Ostelo RW, Bouter LM,
de Vet HC. Rating the methodological quality in systematic
reviews of studies on measurement properties: a
scoring system for the COSMIN checklist. Qual Life Res
2011. http://www.springerlink.com/content/4555127664034526/fulltext.pdf
(29 November 2011, date last accessed).
20 Stoll T, Gordon C, Seifert B et al. Consistency and validity
of patient administered assessment of quality of life by the
MOS SF-36; its association with disease activity and
damage in patients with systemic lupus erythematosus.
J Rheumatol 1997;24:1608–14.
21 Thumboo J, Fong KY, Leong KH, Feng PH, Thio ST,
Boey ML. Validation of the MOS SF-36 for quality of life
assessment of patients with systemic lupus
erythematosus in Singapore. J Rheumatol 1999;26:97–102.
22 Thumboo J, Feng PH, Boey ML, Soh CH, Thio S, Fong KY.
Validation of the Chinese SF-36 for quality of life assessment
in patients with systemic lupus erythematosus.
Lupus 2000;9:708–12.
23 Burckhardt CS, Archenholtz B, Bjelle A. Measuring
the quality of life of women with rheumatoid arthritis or
systemic lupus erythematosus: a Swedish version of the
quality of life scale (QOLS). Scand J Rheumatol 1992;21:190–5.
24 Aggarwal R, Wilke CT, Pickard AS et al. Psychometric
properties of EuroQol 5D and Short Form 6D in patients
with SLE. J Rheumatol 2009;36:1209–16.
25 Grootscholten C, Ligtenberg G, Derksen RH et al. Health
related quality of life in systemic lupus erythematosus:
development and validation of a lupus specific symptom
checklist. Qual Life Res 2003;12:635–44.
26 Freire EA, Guimaraes E, Maia I, Ciconelli RM. Systemic
lupus erythematosus symptom checklist: cross-cultural
adaptation to Brazilian Portuguese language and reliability
evaluation. Acta Reumatol Port 2007;32:341–4.
27 Leong KP, Kong KO, Thong BY et al. Development and
preliminary validation of a systemic lupus
erythematosus-specific quality-of-life instrument
(SLEQOL). Rheumatology 2005;44:1267–76.
28 Kong KO, Ho HJ, Thong BY et al. Cross-cultural adaptation
of the Systemic Lupus Erythematosus Quality of Life
Questionnaire into Chinese. Arthritis Rheum 2007;57:980–5.
29 Freire EA, Bruscato A, Leite DR, Sousa TT, Ciconelli RM.
Translation into Brazilian Portuguese, cultural adaptation
and validation of the systemic lupus erythematosus quality
of life questionnaire (SLEQOL). Acta Reumatol Port 2010;35:334–9.
30 McElhone K, Abbott J, Shelmerdine J et al.
Development and validation of a disease-specific
health-related quality of life measure, the LupusQol, for
adults with systemic lupus erythematosus. Arthritis Care
Res 2007;57:972–9.
31 Jolly M, Pickard AS, Wilke C et al. Lupus-specific health
outcome measure for US patients: the LupusQoL-US
version. Ann Rheum Dis 2010;69:29–33.
32 Doward LC, McKenna SP, Whalley D et al. The development
of the L-QoL: a quality-of-life instrument specific to
systemic lupus erythematosus. Ann Rheum Dis 2009;68:196–200.
33 Likert R. A technique for the development of attitude
scales. Educ Psychol Measure 1952;12:313–5.
34 Burckhardt CS, Anderson KL, Archenholtz B, Hagg O.
The Flanagan Quality of Life Scale: evidence of
construct validity. Health Qual Life Outcomes 2003;1:59.
35 Burckhardt CS, Woods SL, Schultz AA, Ziebarth DM.
Quality of life of adults with chronic illness: a psychometric
study. Res Nurs Health 1989;12:347–54.
696 www.rheumatology.oxfordjournals.org
Madhura Castelino et al.
at centlancs1 on April 24, 2014http://rheumatology.oxfordjournals.org/Downloaded from
... As PubMed provided original, revised, and secondary testing publications, supplementary databases were not deemed necessary. PRO review articles also were scanned to ensure all relevant publications were assessed [19][20][21][27][28][29][30]. The methodological analysis sought information describing the methods and processes employed for instrument development and testing of the original and revised lupus instruments. ...
Article
Full-text available
Background The 2009 Food and Drug Administration (FDA) patient-reported outcome (PRO) guidance outlines characteristics of rigorous PRO-measure development. There are a number of widely used PRO measures for Systemic Lupus Erythematosus (SLE), but it is unknown how well the development processes of SLE PRO measures align with FDA guidance; including updated versions. The objective of this study was to assess how well the LupusQoL and LupusPRO, and corresponding updated versions, LupusQoL-US and LupusPROv1.8, align with Food and Drug Administration (FDA) 2009 patient-reported outcome (PRO) guidance. Methods LupusQoL and LupusPRO were selected as the most widely studied and used Lupus PROs in the UK and US. Original (LupusQoL (2007) and LupusQoL-US (2010)) and revised (LupusPROVv1.7 (2012) and LupusPROv1.8 (2018)) versions were reviewed. We used FDA PRO guidance to create evaluation criteria for key components: target population, concepts measured, measurement properties, documentation across the phases of content validity (item-generation and cognitive interviewing, separately) and other psychometric-property testing. Two reviewers abstracted data independently, compared results, and resolved discrepancies. Results For all measures, the target population was unclear as population characteristics (e.g., ethnicity, education, disease severity) varied, and/or were not consistently reported or not considered across the three phases (e.g., LupusQoL item-generation lacked male involvement, LupusPRO cognitive-interviewing population characteristics were not reported). The item-generation phase for both original measures was conducted with concepts elicited via patient-engagement interviews and item derivation from experts. Cognitive interviewing was conducted via patient feedback with limited item-tracking for original measures. In contrast, the revised measures assumed content validity. 
Other psychometric testing recommendations (reliability, construct validity, ability to detect change) were reported for both original and revised measures, except for ability to detect change for revised measures. Conclusions The SLE PRO measures adhere to some but not all FDA PRO guidance recommendations. Limitations in processes and documentation of the study population, make it unclear for which target population(s) the current Lupus measures are fit-for-purpose.
... Floor or ceiling effects were considered meaningful when >15% of respondents scored at extremes. 30 Median time to completion for each PROMIS domain and the legacy instruments were displayed in seconds and were used as an indicator of instrument feasibility or usability. ...
Article
Full-text available
Background The evaluation of Patient Reported Outcomes Measurement Information System (PROMIS) computerized adaptive test (CAT) in adults with systemic lupus erythematous (SLE) is an emerging field of research. We aimed to examine the test–retest reliability and construct validity of the PROMIS CAT in a Canadian cohort of patients with SLE. Methods Two hundred twenty-seven patients completed 14 domains of PROMIS CAT and seven legacy instruments during their clinical visits. Test–retest reliability of PROMIS was evaluated 7–10 days from baseline using intraclass correlation coefficient (ICC (2; 1)). The construct validity of the PROMIS CAT domains was evaluated against the commonly used legacy instruments, and also in comparison to disease activity and disease damage using Spearman correlations. A multitrait-multimethod matrix (MMM) approach was used to further assess construct validity comparing selected 10 domains of PROMIS and SF-36 domains. Results Moderate to excellent reliability was found for all domains (ICC [2;1] ranging from lowest, 0.66 for Sleep Disturbance and highest, 0.93 for the Mobility domain). Comparing seven legacy instruments with 14 domains of PROMIS CAT, moderate to strong correlations (0.51–0.91) were identified. The average time to complete all PROMIS CAT domains was 11.7 min. The MMM further established construct validity by showing moderate to strong correlations (0.55–0.87) between select PROMIS and SF-36 domains; the average correlations from similar traits (convergent validity) were significantly greater than the average correlations from different traits. Conclusions These results provide evidence on the reliability and validity of PROMIS CAT in SLE in a Canadian cohort.
... Disease-specific quality of life measures for SLE comprise domains that are not captured by generic outcome measures explicitly, such as fatigue. 51 Empirical research to estimate mapping algorithms between disease-specific and generic quality of life measures may be valuable to improve the estimates of health-related quality of life in future modelbased economic evaluations in SLE. 52 National decision-makers provide adoption and research recommendations conditional on subgroupspecific estimates of cost-effectiveness regularly. ...
Article
Full-text available
This study aimed to understand and appraise the approaches taken to handle the complexities of a multisystem disease in published decision-analytic model-based economic evaluations of treatments for SLE. A systematic review was conducted to identify all published model-based economic evaluations of treatments for SLE. Treatments that were considered for inclusion comprised antimalarial agents, immunosuppressive therapies, and biologics including rituximab and belimumab. Medline and Embase were searched electronically from inception until September 2018. Titles and abstracts were screened against the inclusion criteria by two reviewers; agreement between reviewers was calculated according to Cohen’s κ. Predefined data extraction tables were used to extract the key features, structural assumptions and data sources of input parameters from each economic evaluation. The completeness of reporting for the methods of each economic evaluation was appraised according to the Consolidated Health Economic Evaluation Reporting Standards (CHEERS) statement. Six decision-analytic model-based economic evaluations were identified. The studies included azathioprine (n=4), mycophenolate mofetil (n=3), cyclophosphamide (n=2) and belimumab (n=1) as relevant comparator treatments; no economic evaluation estimated the relative cost-effectiveness of rituximab. Six items of the CHEERS statement were reported incompletely across the sample: target population, choice of comparators, measurement and valuation of preference-based outcomes, estimation of resource use and costs, choice of model, and the characterisation of heterogeneity. Complexity in the diagnosis, management and progression of disease can make decision-analytic model-based economic evaluations of treatments for SLE a challenge to undertake. 
The findings from this study can be used to improve the relevance of model-based economic evaluations in SLE and as an agenda for research to inform future health technology assessment and decision-making.
... Notably, a previous study [25] suggested that a ceiling effect of 15% and higher should be considered as 'serious' (as shown for the 3 L version) while relevantly below 15% is considered small (as shown by the 5 L version). Several studies suggested that other HRQoL instruments have shown lower ceiling effects than the EQ-5D while still strongly correlated with the EQ-5D scores, e.g. the SF-6D [26,27]. Also, Round suggests to consider other HRQoL measures instead of EQ-5D [28]. ...
Article
Full-text available
Background: The EuroQoL five-dimensional instrument (EQ-5D) is the favoured preference-based instrument to measure health-related quality of life (HRQoL) in several countries. Two versions of the EQ-5D are available: the 3-level version (EQ-5D-3 L) and the 5-level version (EQ-5D-5 L). This study aims to compare specific measurement properties and scoring of the EQ-5D-3 L (3 L) and EQ-5D-5 L (5 L) in Indonesian type 2 diabetes mellitus (T2DM) outpatients. Methods: A survey was conducted in a hospital and two primary healthcare centres on Sulawesi Island. Participants were asked to complete the two versions of the EQ-5D instruments. The 3 L and 5 L were compared in terms of distribution and ceiling, discriminative power and test-retest reliability. To determine the consistency of the participants' answers, we checked the redistribution pattern, i.e., the consistency of a participant's scores in both versions. Results: A total of 198 T2DM outpatients (mean age 59.90 ± 11.06) completed the 3 L and 5 L surveys. A total of 46 health states for 3 L and 90 health states for 5 L were reported. The '11121' health state was reported most often: 17% in the 3 L and 13% in the 5 L. The results suggested a lower ceiling effect for 5 L (11%) than for 3 L (15%). Regarding redistribution, only 6.1% of responses were found to be inconsistent in this study. The 5 L had higher discriminative power than the 3 L version. Reliability as reflected by the index score was 0.64 for 3 L and 0.74 for 5 L. Pain/discomfort was the dimension mostly affected, whereas the self-care dimension was the least affected. Conclusions: This study suggests that the 5 L-version of the EQ-5D instrument performs better than the 3 L-version in T2DM outpatients in Indonesia, regarding measurement and scoring properties. As such, our study supports the use of the 5 L as the preferred health-related quality of life measurement tool. 
We did not do a trial but this study was approved by the Medical Ethics Committee of Universitas Gadjah Mada Yogyakarta, Indonesia (document number KE/FK/1188/EC, 12 November 2014, amended 16 March 2015).
... Responsiveness or sensitivity to change is the scale's ability to detect changes within the same patient over time. Floor and ceiling effects refer to the percentages of participants scoring the minimum (floor) and maximum (ceiling) possible scores for HRQoL measures [50]. ...
Article
Full-text available
ABSTRACT Introduction: Patients with systemic lupus erythematosus (SLE) have a better survival than decades ago; nevertheless, they still experience a low health-related (HR) quality of life (QoL). Areas covered: After defining QoL and HRQoL, we review the need to assess it, its elements, how to measure it, its predictors, and its impact and potential interventions to improve it. Expert commentary: Physicians assessments of disease activity and damage do not capture the patients’ perspective of their health, and these differences could lead to nonadherence to therapy. Based on that, a comprehensive evaluation of SLE should include the assessment of HRQoL or the sum of the physical, psychological, and social perception of wellbeing, influenced by the patient’s illness. The most consistent predictors of low HRQoL are older age, poverty, lower educational level, behavioral issues, some clinical manifestations, and comorbidities. HRQoL impacts negatively on dealing with stress, intimal relationship, home and job-related activities, and treatment adherence. At the present, there are no successful specific therapeutic strategies aimed at improving it.
... Responsiveness or sensitivity to change is the scale's ability to detect changes within the same patient over time. Floor and ceiling effects refer to the percentages of participants scoring the minimum (floor) and maximum (ceiling) possible scores for HRQoL measures [50]. ...
Article
Introduction: Patients with systemic lupus erythematosus (SLE) have a better survival than decades ago; nevertheless, they still experience a low health-related (HR) quality of life (QoL). Areas covered: After defining QoL and HRQoL we review the need to assess it, its elements, how to measure it, its predictors and its impact and potential interventions to improve it. Expert commentary: Physicians assessments of disease activity and damage do not capture the patients’ perspective of their health, and these differences could lead to nonadherence to therapy. Based on that, a comprehensive evaluation of SLE should include the assessment of HRQoL or the sum of the physical, psychological and social perception of wellbeing, influenced by the patient’s illness. The most consistent predictors of low HRQoL are older age, poverty, lower educational level, behavioral issues, some clinical manifestations and comorbidities. HRQoL impacts negatively on dealing with stress, intimal relationship, home and job-related activities and treatment adherence. At the present, there are no successful specific therapeutic strategies aimed at improving it.
... The instrument was considered to have floor or ceiling effects if >15% of the respondents scored at the extreme ends of the scale. 7 Construct validity was determined using convergent and discriminant validity and known-group validity. Convergent validity was judged to be adequately demonstrated if there were high (>0.6) ...
Article
Full-text available
Inclusion of patient-reported outcomes is important in SLE clinical trials as they allow capture of the benefits of a proposed intervention in areas deemed pertinent by patients. We aimed to compare the measurement properties of health-related quality of life (HRQoL) measures used in adults with SLE and to evaluate their responsiveness to interventions in randomised controlled trials (RCTs). A systematic review was undertaken using full original papers in English identified from three databases: MEDLINE, EMBASE and PubMed. Studies describing the validation of HRQoL measures in English-speaking adult patients with SLE and SLE drug RCTs that used an HRQoL measure were retrieved. Twenty-five validation papers and 26 RCTs were included in the indepth review evaluating the measurement properties of 4 generic (Medical Outcomes Study Short-Form 36 (SF36), Patient Reported Outcomes Measurement Information System (PROMIS) item-bank, EuroQol-5D, and Functional Assessment of Chronic Illness Therapy-Fatigue) and 3 disease-specific (Lupus Quality of Life (LupusQoL), Lupus Patient Reported Outcomes, Lupus Impact Tracker (LIT)) instruments. All measures had good convergent and discriminant validity. PROMIS provided the strongest evidence for known-group validity and reliability among generic instruments; however, data on its responsiveness have not been published. Across measures, standardised response means were generally indicative of poor-moderate sensitivity to longitudinal change. In RCTs, clinically important improvements were reported in SF36 scores from baseline; however, between-arm differences were frequently non-significant and non-important. SF36, PROMIS, LupusQoL and LIT had the strongest evidence for acceptable measurement properties, but few measures aside from the SF36 have been incorporated into clinical trials. 
This review highlights the importance of incorporating a broader range of SLE-specific HRQoL measures in RCTs and warrants further research that focuses on longitudinal responsiveness of newer instruments.
Chapter
With the improving life expectancy in systemic lupus erythematosus (SLE), the assessment of health-related quality of life (HRQoL) has become an important outcome measure in these patients. SLE is an autoimmune multisystem disease that tends to affect women of childbearing age and has a significant impact on the HRQoL comparable to that of more common rheumatological and medical illnesses. Assessing HRQoL alongside disease activity and damage captures the patients’ perspective of the impact of the disease and its treatment on their lives. Generic questionnaires, in particular, the SF-36, are the most common questionnaires for assessing HRQoL in SLE. However, in the last 15 years, SLE-specific questionnaires have been developed and validated, and they have the added advantage of assessing issues more relevant to patients with SLE. They are more sensitive to the changes in HRQoL and therefore more useful when assessing patient-reported outcomes in clinical trials of new interventions for SLE patients. This chapter describes the content of generic and SLE-specific measures and presents evidence for the quality of the scale development and psychometric evaluation of the SLE-specific HRQoL scales.
Article
Objectives Systemic lupus erythematosus (SLE) is a common systemic autoimmune disease that may lead to considerable physical, psychological, and socioeconomical burden. In previous studies, inconsistent results were reported for the association of disease activity and organ damage with health-related quality of life (HRQoL). This paper aimed to explore the relationship between disease activity, organ damage, and HRQoL measured by SF-36, EQ-5D, LupusQoL, and LupusPRO and investigate whether the correlation is region-specific. Methods We systematically searched for studies reporting the association between SLE disease activity, organ damage, and HRQoL in MEDLINE, EMBASE, PsycINFO, World of Science, the Cochrane Library, and CINAHL from inception to December 2019. A meta-analysis and region subgroup analysis were performed with a random-effects model to estimate pooled correlation coefficients and heterogeneity. Results Forty articles were included representing of 6079 adult SLE patients. The meta-analysis of SF-36 and LupusPRO studies revealed mild to moderate negative correlations between disease activity and domains of these HRQoL measurements (correlation coefficient r ranging from −0.27 to −0.07). Likewise, negative correlations were found between organ damage and domains of SF-36 and LupusPRO (r ranging from −0.25 to −0.08). The pooled correlation coefficient is relatively higher in physical functioning related domains than mental health. In the region subgroup analysis, disease activity had strong negative correlations with SF-36 domains in African and European SLE patients, while organ damage had the strongest negative correlation with SF-36 domains in Asian SLE patients (p < 0.010). Conclusion This study provides the first comprehensive assessment of the relationship between disease activity, organ damage, and four popular HRQoL instruments, which provides useful insight into the target therapy in SLE management.
Article
Objective: To investigate the effect of cosmetic camouflage on health-related quality of life (HRQoL) in women with systemic lupus erythematosus (SLE) and permanent facial skin damage. Methods: This is a randomized controlled clinical trial (Universal Trial Number: U1111-1210-2554e) of female outpatients with SLE meeting ACR/1997 and/or SLICC/2012 criteria, aged over 18 years, with modified SLEDAI 2k < 4 and permanent facial skin damage, recruited in two tertiary centers to use cosmetic camouflage (n = 36) or no intervention (n = 20). Endpoints were score variations in SLE Quality of Life (SLEQoL) (total and each domain), Dermatology Life Quality Index (DLQI), Rosenberg self-esteem scale and Hospital Anxiety and Depression Scale (HADS), after daily use of cosmetic camouflage for 12 ± 2 weeks (Phase I), “as needed” use of cosmetic camouflage for another 12 ± 2 weeks (Phase II), and during total follow up (24 ± 2 weeks). Univariate and multivariate linear regressions were conducted by protocol analysis. Results: Both groups were similar at baseline regarding age, disease duration, socio-demographic, clinical, laboratory and treatment characteristics. The comparison of score variations between intervention and control groups showed an independent HRQoL improvement in total SLEQoL score after using cosmetic camouflage in Phase I [β −27.56 (CI 95% −47.86 to −7.27) p = 0.009] and total follow up [β −28.04 (CI 95% −48.65 to −7.44) p = 0.09], specifically in mood, self-image and physical functioning domains. Also, there was an improvement in DLQI scores during Phase I [β −7.65 (CI 95% −12.31 to −3.00) p = 0.002] and total follow up [β −8.97 (CI 95% −12.99 to −4.94) p < 0.001]. Scores for depression [β −1.92 (CI 95% −3.67 to −0.16) p = 0.033], anxiety [β −2.87 (CI 95% −5.67 to −0.07) p = 0.045] and self-esteem [β 2.79 (CI 95% 0.13 to 5.46) p = 0.041] improved considering the total follow up. No significant changes occurred in the control group scores.
Conclusion: The use of cosmetic camouflage improved the HRQoL in female SLE patients with permanent facial skin damage.
Article
The COSMIN checklist is a standardized tool for assessing the methodological quality of studies on measurement properties. It contains 9 boxes, each dealing with one measurement property, with 5-18 items per box about design aspects and statistical methods. Our aim was to develop a scoring system for the COSMIN checklist to calculate quality scores per measurement property when using the checklist in systematic reviews of measurement properties. The scoring system was developed based on discussions among experts and testing of the scoring system on 46 articles from a systematic review. Four response options were defined for each COSMIN item (excellent, good, fair, and poor). A quality score per measurement property is obtained by taking the lowest rating of any item in a box ("worst score counts"). Specific criteria for excellent, good, fair, and poor quality for each COSMIN item are described. In defining the criteria, the "worst score counts" algorithm was taken into consideration. This means that only fatal flaws were defined as poor quality. The scores of the 46 articles show how the scoring system can be used to provide an overview of the methodological quality of studies included in a systematic review of measurement properties. Based on experience in testing this scoring system on 46 articles, the COSMIN checklist with the proposed scoring system seems to be a useful tool for assessing the methodological quality of studies included in systematic reviews of measurement properties.
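The "worst score counts" rule described above can be sketched in a few lines. The four rating labels come from the abstract; the function name `box_quality`, the data layout, and the example ratings are hypothetical and serve only to illustrate the rule.

```python
# Ratings ordered from worst to best, so the lowest index is the worst rating.
RATING_ORDER = ["poor", "fair", "good", "excellent"]


def box_quality(item_ratings):
    """Return the quality score of one COSMIN box: its lowest item rating."""
    if not item_ratings:
        raise ValueError("a box needs at least one rated item")
    return min(item_ratings, key=RATING_ORDER.index)


# Hypothetical item ratings for one study, keyed by measurement property.
study = {
    "internal consistency": ["excellent", "good", "excellent"],
    "reliability": ["good", "fair", "excellent", "good"],
    "content validity": ["excellent", "poor", "good"],
}

scores = {prop: box_quality(items) for prop, items in study.items()}
```

Because a single low item determines the score of the whole box, the rule only works if "poor" is reserved for fatal flaws, which is the design choice the abstract describes.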
Article
The objective was to translate the Systemic Lupus Erythematosus Quality of Life Questionnaire (SLEQOL) into Brazilian Portuguese, perform its cross-cultural adaptation, and assess its reliability and validity. A total of 107 SLE patients answered the SLEQOL questionnaire. Translation into Portuguese and cross-cultural adaptation were performed in accordance with studies on questionnaire translation methodology into other languages. Reliability was analyzed using three interviews with different interviewers, two on the same day (interobserver) and the third within 14 days of the first assessment (intraobserver). Validity was assessed by correlating clinical and quality of life parameters with the SLEQOL. A descriptive analysis of the study sample was performed. Reproducibility was assessed using an intraclass correlation coefficient (ICC). Internal consistency was assessed using Cronbach's alpha coefficient. Pearson's correlation coefficient was used to assess validity. A significance level of 5% was adopted for all statistical tests. The SLEQOL was translated and culturally adapted. The main findings were an internal consistency coefficient of 0.807 for all questions and domains, and inter- and intraobserver correlation coefficients of 0.990 and 0.969, respectively. Validation showed good correlation with the SF-36 and poor correlation with lupus activity or damage indices. Quality of life has been increasingly taken into account for chronic diseases. To date there are no tools to assess quality of life in systemic lupus erythematosus (SLE) written in the Portuguese language. The questionnaire is valid and reliable for SLE patients in Brazil.
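As an illustration of the internal consistency statistic named above, here is a minimal sketch of Cronbach's alpha computed from a respondents-by-items score matrix. The function name `cronbach_alpha` and the toy data in the usage note are hypothetical; the study's own data and software are not described in the abstract.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores)),
    using sample variances throughout.
    """
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    item_vars = [variance(column) for column in zip(*responses)]
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

With the toy matrix `[[1, 2], [2, 1], [3, 3]]` the item variances are both 1 and the variance of the total scores is 3, so alpha is 2 × (1 − 2/3) = 2/3. Values near 1 indicate that the items vary together across respondents.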
Article
The COSMIN checklist (COnsensus-based Standards for the selection of health status Measurement INstruments) was developed in an international Delphi study to evaluate the methodological quality of studies on measurement properties of health-related patient reported outcomes (HR-PROs). In this paper, we explain our choices for the design requirements and preferred statistical methods for which no evidence is available in the literature or on which the Delphi panel members had substantial discussion. The issues described in this paper are a reflection of the Delphi process in which 43 panel members participated. The topics discussed are internal consistency (relevance for reflective and formative models, and distinction with unidimensionality), content validity (judging relevance and comprehensiveness), hypotheses testing as an aspect of construct validity (specificity of hypotheses), criterion validity (relevance for PROs), and responsiveness (concept and relation to validity, and (in)appropriate measures). We expect that this paper will contribute to a better understanding of the rationale behind the items, thereby enhancing the acceptance and use of the COSMIN checklist.
Article
The aim of the COSMIN study (COnsensus-based Standards for the selection of health status Measurement INstruments) was to develop a consensus-based checklist to evaluate the methodological quality of studies on measurement properties. We present the COSMIN checklist and the agreement of the panel on the items of the checklist. A four-round Delphi study was performed with international experts (psychologists, epidemiologists, statisticians and clinicians). Of the 91 invited experts, 57 agreed to participate (63%). Panel members were asked to rate their (dis)agreement with each proposal on a five-point scale. Consensus was considered to be reached when at least 67% of the panel members indicated 'agree' or 'strongly agree'. Consensus was reached on the inclusion of the following measurement properties: internal consistency, reliability, measurement error, content validity (including face validity), construct validity (including structural validity, hypotheses testing and cross-cultural validity), criterion validity, responsiveness, and interpretability. The latter was not considered a measurement property. The panel also reached consensus on how these properties should be assessed. The resulting COSMIN checklist could be useful when selecting a measurement instrument, peer-reviewing a manuscript, designing or reporting a study on measurement properties, or for educational purposes.
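The consensus rule stated above (at least 67% of panel members choosing 'agree' or 'strongly agree') can be written as a small check. The function name `consensus_reached` and the label for the non-agreeing votes are hypothetical; only the two agreement labels and the 67% threshold come from the abstract.

```python
AGREEMENT = {"agree", "strongly agree"}


def consensus_reached(ratings, threshold=0.67):
    """Return True when the share of agreeing ratings meets the threshold."""
    if not ratings:
        raise ValueError("no ratings supplied")
    agreeing = sum(1 for rating in ratings if rating in AGREEMENT)
    return agreeing / len(ratings) >= threshold


# Hypothetical round with 57 respondents, 39 of whom agree.
panel = ["agree"] * 30 + ["strongly agree"] * 9 + ["disagree"] * 18
reached = consensus_reached(panel)
```

In a panel of 57, 39 agreeing votes (about 68%) meet the threshold, whereas 38 (about 66.7%) fall just short of it.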
Article
The Core Outcome Measures Index (COMI) is a reliable and valid instrument for assessing multidimensional outcome in spine surgery. The minimal clinically important score-difference (MCID) for improvement (MCID(imp)) was determined in one of the original research studies validating the instrument, but has never been confirmed in routine clinical practice. Further, the MCID for deterioration (MCID(det)) has never been investigated; indeed, this needs very large sample sizes to obtain sufficient cases with worsening. This study examined the MCIDs of the COMI in routine clinical practice. All patients undergoing surgery in our Spine Center since February 2004 were asked to complete the COMI before and 12 months after surgery. The COMI has one question each on back (neck) pain intensity, leg/buttock (arm/shoulder) pain intensity, function, symptom-specific well-being, general quality of life, work disability, and social disability, scored as a 0-10 index. At follow-up, patients also rated the global effectiveness of surgery, on a 5-point Likert scale. This was used as the external criterion ("anchor") in receiver operating characteristics (ROC) analyses to derive cut-off scores for individual improvement and deterioration. Twelve-month follow-up questionnaires were returned by 3,056 (92%) patients. The group mean COMI score change for patients declaring that the "operation helped" was a reduction of 3.1 points; the corresponding value for those whom it "did not help" was a reduction of 0.5 points. The group MCID(imp) was hence 2.6 points reduction; the corresponding group MCID(det) was 1.2 points increase (0.5 minus -0.7). The area under the ROC curve was 0.88 for MCID(imp) and 0.89 for MCID(det) (both P < 0.0001), indicating that the COMI had good discriminative ability. 
The cut-offs for individual improvement and deterioration, respectively, were ≥2.2 points decrease (sensitivity 81%, specificity 83%) and ≥0.3 points increase (sensitivity 83%, specificity 88%). The MCID(imp) score of 2.2 points was similar to that reported in the original study (2-3 points, depending on external criterion used). The MCID(det) suggested that the COMI is less responsive to deterioration than to improvement, a phenomenon also reported for other spine outcome instruments. This needs further investigation in even larger patient groups. The MCIDs provide essential information for both the planning (sample size) and interpretation of the results (clinical relevance) of future clinical studies using the COMI.
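The two MCID calculations in this abstract can be sketched as follows. The group MCID is the difference between the mean changes of two anchor groups, as reported (3.1 − 0.5 = 2.6 points). For the individual cut-off, the abstract says only that ROC analysis against the anchor was used; selecting the cut-off by maximizing Youden's J (sensitivity + specificity − 1) is an assumption made here, and the function names and the sample data in the usage note are hypothetical.

```python
def group_mcid(mean_change_helped, mean_change_not_helped):
    """Group-level MCID: difference in mean score change between anchor groups."""
    return mean_change_helped - mean_change_not_helped


def roc_cutoff(changes, improved):
    """Pick the score-change cut-off that best separates the anchor groups.

    changes  : per-patient score reductions (positive means improvement)
    improved : per-patient anchor (True if the patient reported improvement)
    A patient is classified as improved when change >= cut-off. The cut-off
    maximizing Youden's J is returned as (cutoff, sensitivity, specificity).
    """
    positives = sum(1 for flag in improved if flag)
    negatives = len(improved) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both anchor groups must be non-empty")

    best = None
    for cutoff in sorted(set(changes)):
        true_pos = sum(1 for c, y in zip(changes, improved) if y and c >= cutoff)
        true_neg = sum(1 for c, y in zip(changes, improved) if not y and c < cutoff)
        sensitivity = true_pos / positives
        specificity = true_neg / negatives
        youden_j = sensitivity + specificity - 1
        if best is None or youden_j > best[0]:
            best = (youden_j, cutoff, sensitivity, specificity)
    return best[1:]
```

Calling `roc_cutoff([4.0, 3.0, 2.5, 0.5, 1.0, 0.0], [True, True, True, False, False, False])` on this perfectly separated toy sample selects a cut-off of 2.5 points with sensitivity and specificity both equal to 1. Real data overlap, which is why the reported cut-offs come with sensitivities and specificities below 100%.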
Article
Health-related quality of life (HRQOL) is an important patient-reported outcome in systemic lupus erythematosus (SLE). We evaluated the psychometric properties of 2 widely used preference-based generic HRQOL measures, EuroQol-5D (EQ-5D) and Short Form-6D (SF-6D), among United States patients with SLE. Patients with SLE enrolled at an academic institution were assessed for self-reported generic HRQOL (EQ-5D, Medical Outcomes Study SF-36), disease activity, and disease damage. SF-6D, Physical Component Score (PCS), and Mental Component Score (MCS) were calculated from SF-36. Criterion validity, convergent validity, and known-groups comparisons were evaluated for EQ-5D and SF-6D. Sensitivity to change (t tests, effect size) was evaluated in a subset of the cohort followed longitudinally. One hundred sixty-seven patients with SLE were enrolled. Related domains on the EQ-5D and SF-36 correlated strongly, e.g., mobility and physical functioning (r = 0.60), whereas unrelated domains showed weak to moderate correlation. EQ-5D index, EQ-5D visual analog scale, and SF-6D score correlated strongly with each other as well as with most domains of SF-36. Both EQ-5D and SF-6D indices differentiated among patients of varied disease severity. EQ-5D and SF-6D were found to be sensitive to self-reported change in health but insensitive to change in disease activity longitudinally. Disease activity and damage showed weak correlation with HRQOL measures. The SF-6D and EQ-5D exhibited satisfactory psychometric properties for use among US patients with SLE. Measures of disease activity and damage were weakly correlated with HRQOL, suggesting that HRQOL is an important complementary source of information about patients with SLE.