Representativeness of the Patient-Reported Outcomes Measurement Information System Internet panel

UCLA Department of Medicine, Division of General Internal Medicine & Health Services Research, University of California-Los Angeles, 911 Broxton Plaza, Los Angeles, CA 90095, USA.
Journal of clinical epidemiology (Impact Factor: 3.42). 11/2010; 63(11):1169-78. DOI: 10.1016/j.jclinepi.2009.11.021
Source: PubMed


To evaluate the Patient-Reported Outcomes Measurement Information System (PROMIS), which collected data from an Internet polling panel, and to compare PROMIS with national norms.
We compared demographics and self-rated health of the PROMIS general Internet sample (N=11,796) and one of its subsamples (n=2,196) selected to approximate the joint distribution of demographics from the 2000 U.S. Census, with three national surveys and U.S. Census data. The comparisons were conducted using equivalence testing with weights created for PROMIS by raking.
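The raking step mentioned above can be illustrated with a minimal sketch of iterative proportional fitting. The abstract does not say which variables or target margins were used, so the variable names (`sex`, `edu`), the target shares, and the ten-row toy sample below are hypothetical and chosen only to show the mechanics.

```python
def rake(rows, margins, max_iter=100, tol=1e-10):
    """Iterative proportional fitting over one-way margins.

    rows    : list of dicts mapping variable name -> category label
    margins : {variable: {category: target population share}}
    Returns one weight per row; the weights sum to len(rows).
    """
    weights = [1.0] * len(rows)
    for _ in range(max_iter):
        largest_change = 0.0
        for var, targets in margins.items():
            total = sum(weights)
            # Current weighted total in each category of this variable.
            current = {}
            for w, row in zip(weights, rows):
                current[row[var]] = current.get(row[var], 0.0) + w
            # Scale every row so the category shares hit the targets.
            for i, row in enumerate(rows):
                factor = targets[row[var]] * total / current[row[var]]
                weights[i] *= factor
                largest_change = max(largest_change, abs(factor - 1.0))
        if largest_change < tol:
            break
    return weights


def weighted_share(rows, weights, var):
    """Weighted share of each category of `var`."""
    total = sum(weights)
    shares = {}
    for w, row in zip(weights, rows):
        shares[row[var]] = shares.get(row[var], 0.0) + w / total
    return shares


# Hypothetical toy sample that over-represents women and higher education.
rows = (
    [{"sex": "F", "edu": "hs_or_less"}] * 2
    + [{"sex": "F", "edu": "more_than_hs"}] * 4
    + [{"sex": "M", "edu": "hs_or_less"}] * 1
    + [{"sex": "M", "edu": "more_than_hs"}] * 3
)
# Hypothetical target margins (illustrative values, not Census figures).
margins = {
    "sex": {"F": 0.51, "M": 0.49},
    "edu": {"hs_or_less": 0.48, "more_than_hs": 0.52},
}

weights = rake(rows, margins)
print(weighted_share(rows, weights, "sex"))
print(weighted_share(rows, weights, "edu"))
```

Raking matches only the one-way margins supplied to it; it does not by itself reproduce a joint distribution, which is presumably why the study also drew a subsample selected to approximate the joint Census distribution.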
The weighted PROMIS population and subsample had demographics similar to those of the 2000 U.S. Census, except that the subsample had a higher percentage of people with more than a high school education. Equivalence testing showed similarity between the PROMIS general population and national norms with regard to body mass index, the EQ-5D health index (the EuroQol Group's descriptive system of health-related quality of life states, consisting of five dimensions: mobility, self-care, usual activities, pain/discomfort, and anxiety/depression), and self-rating of general health.
Self-rated health of the PROMIS general population is similar to that of existing samples from the general U.S. population. The weighted PROMIS general population is more comparable to national norms than the unweighted population with regard to subject characteristics. The findings suggest that the representativeness of the Internet data is comparable to that of data from probability-based general population samples.
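Equivalence testing reverses the usual null hypothesis: two estimates are declared similar only if their difference can be shown to lie inside a pre-specified margin. The abstract does not state which procedure or margins were used, so the sketch below is a generic two one-sided tests (TOST) comparison of two means under a normal approximation. The function names, the margin, and the example numbers are all hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tost_means(mean_a, se_a, mean_b, se_b, margin, alpha=0.05):
    """Two one-sided z-tests for equivalence of two independent means.

    Equivalence is declared when both one-sided nulls are rejected:
    (difference <= -margin) and (difference >= +margin).
    Returns (p_value, is_equivalent).
    """
    diff = mean_a - mean_b
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    # Test against the lower bound: is the difference above -margin?
    p_lower = 1.0 - normal_cdf((diff + margin) / se_diff)
    # Test against the upper bound: is the difference below +margin?
    p_upper = normal_cdf((diff - margin) / se_diff)
    p_value = max(p_lower, p_upper)
    return p_value, p_value < alpha


# Hypothetical health-index means and standard errors, margin of 0.05.
p_value, equivalent = tost_means(0.87, 0.005, 0.86, 0.005, margin=0.05)
print(p_value, equivalent)
```

The choice of margin drives the conclusion, so it should be fixed before the data are examined.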

Available from: Richard Gershon
    • "There are 6 items in each of the seven domains, with responses ranging from 1 to 5. Raw scores for each domain are calculated by summing the item scores while adjusting for missing item responses, and it can be estimated if at least 4 out of the 6 items in that domain were answered. Raw scores are transformed using the T-score metric based on the item response theory calibrations in which scores have a mean of 50 and a SD of 10 for the general population in the U.S. [22,23]. T-scores can be estimated using the scoring tables listed in the PROMIS manuals [24]. "
    ABSTRACT: Background: The Patient Reported Outcomes Measurement Information System 43-item short form (PROMIS-43) and the five-level EQ-5D (EQ-5D-5L) are recently developed measures of health-related quality of life (HRQL) that have potentially broad application in evaluating treatments and capturing burden of respiratory-related diseases. The aims of this study were: (1) to examine their psychometric properties in patients with chronic obstructive pulmonary disease (COPD), and (2) to identify dimensions of HRQL that differ and do not differ by lung function. Methods: We conducted a multi-center, cross-sectional study (“COPD Outcomes-based Network for Clinical Effectiveness & Research Translation” [CONCERT]). We analyzed patients who met spirometric criteria for COPD and completed EQ-5D-5L and PROMIS questionnaires. Disease severity was graded based on the Global Initiative for Chronic Obstructive Lung Disease (GOLD) classification. Pulmonary function test, PROMIS-43, EQ-5D (index score and EQ-Visual Analog Scale [EQ-VAS]), six minute walk test (6MWT), and three dyspnea scales (mMRC, Borg, FACIT-Dyspnea) were administered. Validity and reliability of EQ-5D-5L and PROMIS-43 were examined, and differences in HRQL by GOLD grade were assessed. Results: Data from 670 patients with COPD were analyzed (mean age 68.5 years; 58% male). More severe COPD was associated with more problems with mobility, self-care and usual activities (all p-values <0.01) according to EQ-5D-5L. Related domains on EQ-5D-5L, PROMIS and clinical measures were moderately (r = 0.30-0.49) to strongly (r ≥ 0.50) correlated. A statistically significant trend of decreasing HRQL with worsening lung function was observed for EQ-5D-5L index scores, EQ-VAS scores, and PROMIS physical function and social roles. Conclusions: Results supported the validity of EQ-5D-5L and PROMIS-43 in COPD patients and indicated that patient-reported physical function and social activities, but not pain, mental health, sleep, or fatigue, decrease with worsening lung function by GOLD grade.
    Full-text · Article · Jun 2014 · BMC Medical Research Methodology
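The scoring rule quoted in this entry (sum the item responses, adjust for missing items, and score a domain only when at least 4 of its 6 items were answered) can be sketched as follows. The excerpt does not say how the adjustment is made, so proportional prorating is an assumption here, and the function names are hypothetical. The raw-to-T lookup tables from the PROMIS manuals are not reproduced; `theta_to_t_score` only shows the linear rescaling that defines a metric with mean 50 and SD 10, assuming a latent score standardized in the reference population.

```python
def prorated_raw_score(responses, min_answered=4):
    """Sum item scores (1 to 5), prorating for unanswered items.

    responses    : list of item scores, with None for a missing answer
    min_answered : minimum number of answered items needed to score
    Returns None when too few items were answered.
    """
    answered = [r for r in responses if r is not None]
    if len(answered) < min_answered:
        return None
    # Assumed adjustment: scale the observed sum up to the full item count.
    return sum(answered) * len(responses) / len(answered)


def theta_to_t_score(theta):
    """Rescale a standardized latent score to the T-score metric."""
    return 50.0 + 10.0 * theta


# Hypothetical 6-item domain with two unanswered items.
print(prorated_raw_score([3, 4, None, 2, 5, None]))
# Too few answers: the domain is left unscored.
print(prorated_raw_score([3, None, None, None, 5, 2]))
# One SD below the reference-population mean.
print(theta_to_t_score(-1.0))
```

In practice the prorated raw score would be looked up in the published scoring table for the specific short form rather than converted by formula.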
    • "At the end of each of the four weeks, these 3 domains were assessed with PROMIS CATs that were set to administer no less than 4 and no more than 12 items and to terminate when >.90 score reliability (SE < 3 T-score points) was achieved. The scores are reported on a T-score metric (mean = 50; standard deviation = 10) that is anchored to the distribution of scores in the U.S. general population [25] [26]. "
    ABSTRACT: This study examined the ecological validity and clinical utility of NIH Patient-Reported Outcomes Measurement Information System (PROMIS®) instruments for anger, depression, and fatigue in women with premenstrual symptoms. One hundred women completed daily diaries and weekly PROMIS assessments over 4 weeks. Weekly assessments were administered through Computerized Adaptive Testing (CAT). Weekly CATs and corresponding daily scores were compared to evaluate ecological validity. To test clinical utility, we examined whether CATs could detect changes in symptom levels, whether these changes mirrored those obtained from daily scores, and whether CATs could identify clinically meaningful premenstrual symptom change. PROMIS CAT scores were higher in the pre-menstrual week than in the baseline (ps<.0001) and post-menstrual (ps<.0001) weeks. The correlations between CATs and aggregated daily scores ranged from .73 to .88, supporting ecological validity. Mean CAT scores showed systematic changes in accordance with the menstrual cycle, and the magnitudes of the changes were similar to those obtained from the daily scores. Finally, Receiver Operating Characteristic (ROC) analyses demonstrated the ability of the CATs to discriminate between women with and without clinically meaningful premenstrual symptom change. PROMIS CAT instruments for anger, depression, and fatigue demonstrated validity and utility in premenstrual symptom assessment. The results provide encouraging initial evidence of the utility of PROMIS instruments for the measurement of affective premenstrual symptoms.
    Full-text · Article · Apr 2014 · Journal of psychosomatic research
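The stopping rule quoted in this entry pairs a reliability above .90 with a standard error below 3 T-score points. On a metric with SD 10, the standard relation reliability = 1 - (SE / SD)^2 gives 0.91 at SE = 3, which is why the two thresholds go together. The sketch below encodes that relation and the item-count limits from the excerpt (at least 4, at most 12 items). The function names are hypothetical, and the item selection and score estimation steps of a real CAT are not shown.

```python
def reliability_from_se(se, sd=10.0):
    """Score reliability implied by a standard error on a metric with SD `sd`."""
    return 1.0 - (se / sd) ** 2


def should_stop(n_items, se, min_items=4, max_items=12, se_threshold=3.0):
    """Decide whether the adaptive test should terminate.

    n_items : number of items administered so far
    se      : current standard error of the score, in T-score points
    """
    if n_items < min_items:
        return False  # always administer the minimum number of items
    if n_items >= max_items:
        return True   # never exceed the maximum, whatever the precision
    return se < se_threshold


print(reliability_from_se(3.0))
print(should_stop(3, 2.0))   # precise enough, but below the item minimum
print(should_stop(5, 2.9))   # precision target met after 5 items
print(should_stop(12, 5.0))  # item maximum reached
```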
    • "PROMIS uses a 7-day recall period for the domains in this study (anger, fatigue, depression). The scores are normed on a T-score metric, which is scaled to have a mean of 50 and a standard deviation of 10 in the U.S. general population [3] [25]. "
    ABSTRACT: Patient-reported outcome measures with reporting periods of a week or more are often used to evaluate the change of symptoms over time, but the accuracy of recall in the context of change is not well understood. This study examined whether temporal trends in symptoms that occur during the reporting period impact the accuracy of 7-day recall reports. Women with premenstrual symptoms (n=95) completed daily reports of anger, depression, fatigue, and pain intensity for 4 weeks, as well as 7-day recall reports at the end of each week. Latent class growth analysis was used to categorize recall periods based on the direction and rate of change in the daily reports. Agreement (level differences and correlations) between 7-day recall and aggregated daily scores was compared for recall periods with different temporal trends. Recall periods with positive, negative, and flat temporal trends were identified, and they varied in accordance with weeks of the menstrual cycle. Replicating previous research, 7-day recall scores were consistently higher than aggregated daily scores, but this level difference was more pronounced for recall periods involving positive and negative trends compared with flat trends. Moreover, correlations between 7-day recall and aggregated daily scores were lower in the presence of positive and negative trends compared with flat trends. These findings were largely consistent for anger, depression, fatigue, and pain intensity. Temporal trends in symptoms can influence the accuracy of recall reports, and this should be considered in research designs involving change.
    Full-text · Article · Aug 2013 · Journal of psychosomatic research