Linking Inpatient Clinical Registry Data to Medicare Claims Data Using Indirect Identifiers

Duke Clinical Research Institute, Duke University School of Medicine, Durham, NC 27715, USA.
American Heart Journal (Impact Factor: 4.46). 07/2009; 157(6):995-1000. DOI: 10.1016/j.ahj.2009.04.002
Source: PubMed


Inpatient clinical registries generally have limited ability to provide a longitudinal perspective on care beyond the acute episode. We present a method to link hospitalization records from registries with Medicare inpatient claims data, without using direct identifiers, to create a unique data source that pairs rich clinical data with long-term outcome data.
The method takes advantage of the hospital clustering observed in each database by demonstrating that different combinations of indirect identifiers within hospitals yield a large proportion of unique patient records. This high level of uniqueness also allows linking without advance knowledge of the Medicare provider number of each registry hospital. We applied this method to 2 inpatient databases and were able to identify 81% of 39,178 records in a large clinical registry of patients with heart failure and 91% of 6,581 heart failure records from a hospital inpatient database. The quality of the link is high, and reasons for incomplete linkage are explored. Finally, we discuss the unique opportunities afforded by combining claims and clinical data for specific analyses.
In the absence of direct identifiers, it is possible to create a high-quality link between inpatient clinical registry data and Medicare claims data. The method will allow researchers to use existing data to create a linked claims-clinical database that capitalizes on the strengths of both types of data sources.
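The core idea, that a combination of indirect identifiers within a hospital is often unique to one record, can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not list the identifiers used, so the field names below (`hospital`, `admit_date`, `discharge_date`, `birth_date`, `sex`) and the functions `key_of`, `unique_by_key`, and `link` are hypothetical. The sketch also assumes, for simplicity, that hospital identifiers already agree across the two sources, whereas the method described above does not require knowing the Medicare provider number in advance.

```python
from collections import Counter

# Hypothetical indirect identifiers; the abstract does not name the actual fields.
KEY_FIELDS = ("hospital", "admit_date", "discharge_date", "birth_date", "sex")


def key_of(record):
    """Build the combined indirect-identifier key for one record."""
    return tuple(record[field] for field in KEY_FIELDS)


def unique_by_key(records):
    """Index only those records whose key occurs exactly once in the source."""
    counts = Counter(key_of(r) for r in records)
    return {key_of(r): r for r in records if counts[key_of(r)] == 1}


def link(registry, claims):
    """Pair registry and claims records that share a key unique in both sources."""
    reg = unique_by_key(registry)
    clm = unique_by_key(claims)
    return [(reg[k]["registry_id"], clm[k]["claim_id"]) for k in reg if k in clm]


def rec(id_field, id_value, hospital, admit, discharge, birth, sex):
    """Small helper to build toy records for the example."""
    return {id_field: id_value, "hospital": hospital, "admit_date": admit,
            "discharge_date": discharge, "birth_date": birth, "sex": sex}


# Toy data, invented purely for illustration.
registry = [
    rec("registry_id", "R1", "H1", "2005-01-03", "2005-01-09", "1931-04-12", "F"),
    rec("registry_id", "R2", "H1", "2005-02-10", "2005-02-14", "1928-11-30", "M"),
    # R3 shares R2's key, so neither can be linked unambiguously.
    rec("registry_id", "R3", "H1", "2005-02-10", "2005-02-14", "1928-11-30", "M"),
    # R4 has no counterpart in the claims data.
    rec("registry_id", "R4", "H2", "2005-03-01", "2005-03-05", "1935-07-21", "F"),
]
claims = [
    rec("claim_id", "C1", "H1", "2005-01-03", "2005-01-09", "1931-04-12", "F"),
    rec("claim_id", "C2", "H1", "2005-02-10", "2005-02-14", "1928-11-30", "M"),
    rec("claim_id", "C3", "H3", "2005-04-02", "2005-04-08", "1940-09-15", "M"),
]

links = link(registry, claims)
link_rate = len(links) / len(registry)
```

In this toy data only R1 links: the key shared by R2 and R3 is ambiguous within the registry, and R4 has no matching claim. Non-unique keys and unmatched records are two simple ways a linkage of this kind can end up incomplete.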



Available from: Bradley G Hammill,
    • "Linking data sets in community-based programs could be challenging, because limited identifiers are collected to encourage participation and protect participant privacy. Yet, linking data sets without a single unique identifier can be accomplished as long as a combination of variables creates some level of uniqueness (Hammill et al., 2009; Lawson et al., 2012). For example, millions of people share the same date of birth, thousands of people share the same first name, and hundreds to thousands of people live in the same ZIP code."
    ABSTRACT: In community-based wellness programs, Social Security Numbers (SSNs) are rarely collected to encourage participation and protect participant privacy. One measure of program effectiveness includes changes in health care utilization. For the 65 and over population, health care utilization is captured in Medicare administrative claims data. Therefore, methods as described in this article for linking participant information to administrative data are useful for program evaluations where unique identifiers such as SSN are not available. Following fuzzy matching methodologies, participant information from the National Study of the Chronic Disease Self-Management Program was linked to Medicare administrative data. Linking variables included participant name, date of birth, gender, address, and ZIP code. Seventy-eight percent of participants were linked to their Medicare claims data. Linking program participant information to Medicare administrative data where unique identifiers are not available provides researchers with the ability to leverage claims data to better understand program effects.
    Evaluation & the Health Professions 08/2014; DOI:10.1177/0163278714547568 · 1.91 Impact Factor
    • "Certain countries may have national donor registries that could be cross-linked to other databases. However, some national registries do not collect or distribute patient identifiers that are needed to link clinical or claims data [20]. An advantage of using administrative data is that it will provide similar information on both those who did and did not become donors (the latter is not recorded in donor registries)."
    ABSTRACT: We evaluated the validity of physician billing claims to identify deceased organ donors in large provincial healthcare databases. We conducted a population-based retrospective validation study of all deceased donors in Ontario, Canada from 2006 to 2011 (n = 988). We included all registered deaths during the same period (n = 458,074). Our main outcome measures included sensitivity, specificity, positive predictive value, and negative predictive value of various algorithms consisting of physician billing claims to identify deceased organ donors and organ-specific donors compared to a reference standard of medical chart abstraction. The best performing algorithm consisted of any one of 10 different physician billing claims. This algorithm had a sensitivity of 75.4% (95% CI: 72.6% to 78.0%) and a positive predictive value of 77.4% (95% CI: 74.7% to 80.0%) for the identification of deceased organ donors. As expected, specificity and negative predictive value were near 100%. The number of organ donors identified by the algorithm each year was similar to the expected value, and this included the pre-validation period (1991 to 2005). Algorithms to identify organ-specific donors performed poorly (e.g. sensitivity ranged from 0% for small intestine to 67% for heart; positive predictive values ranged from 0% for small intestine to 37% for heart). Primary data abstraction to identify deceased organ donors should be used whenever possible, particularly for the detection of organ-specific donations. The limitations of physician billing claims should be considered whenever they are used.
    PLoS ONE 08/2013; 8(8):e70825. DOI:10.1371/journal.pone.0070825 · 3.23 Impact Factor
  • ABSTRACT: This summary of the last session of the First Neurocritical Care Research Conference reviews the discussions about research priorities in neurocritical care. The first presentation reviewed current projects funded by the National Institute of Neurological Disorders and Stroke at the National Institutes of Health and potential models to follow, including an independent Neurocritical Care Network or the creation of such a network with the goal of collaborating with already existing ones. Experienced neurointensivists then presented their views on the most common and important research questions that need to be answered and investigated in the field. Finally, the utility of clinical registries was discussed, emphasizing their importance as hypothesis generators. During the group discussion, interest was expressed in comparative effectiveness research, the use of physiological endpoints from monitoring, and alternate trial designs.
    Neurocritical Care 02/2012; 16(1-1):35-41. DOI:10.1007/s12028-011-9611-y · 2.44 Impact Factor