Shawn N Murphy

Partners HealthCare, Boston, MA, USA

Publications (31) · 51.97 Total impact

  • Article: Improving Case Definition of Crohn's Disease and Ulcerative Colitis in Electronic Medical Records Using Natural Language Processing: A Novel Informatics Approach.
    ABSTRACT: BACKGROUND: Previous studies identifying patients with inflammatory bowel disease using administrative codes have yielded inconsistent results. Our objective was to develop a robust electronic medical record-based model for classification of inflammatory bowel disease leveraging the combination of codified data and information from clinical text notes using natural language processing. METHODS: Using the electronic medical records of 2 large academic centers, we created data marts for Crohn's disease (CD) and ulcerative colitis (UC) comprising patients with ≥1 International Classification of Diseases, 9th edition, code for each disease. We used codified (i.e., International Classification of Diseases, 9th edition codes, electronic prescriptions) and narrative data from clinical notes to develop our classification model. Model development and validation were performed in a training set of 600 randomly selected patients for each disease with medical record review as the gold standard. Logistic regression with the adaptive LASSO penalty was used to select informative variables. RESULTS: We confirmed 399 CD cases (67%) in the CD training set and 378 UC cases (63%) in the UC training set. For both, a combined model including narrative and codified data had better accuracy (area under the curve for CD 0.95; UC 0.94) than models using only disease International Classification of Diseases, 9th edition codes (area under the curve 0.89 for CD; 0.86 for UC). Addition of natural language processing narrative terms to our final model resulted in classification of 6% to 12% more subjects with the same accuracy. CONCLUSIONS: Inclusion of narrative concepts identified using natural language processing improves the accuracy of electronic medical records case definition for CD and UC while simultaneously identifying more subjects compared with models using codified data alone.
    Inflammatory Bowel Diseases 04/2013; · 4.86 Impact Factor
  • Article: Similar Risk of Depression and Anxiety Following Surgery or Hospitalization for Crohn's Disease and Ulcerative Colitis.
    ABSTRACT: OBJECTIVES: Psychiatric comorbidity is common in Crohn's disease (CD) and ulcerative colitis (UC). Inflammatory bowel disease (IBD)-related surgery or hospitalizations represent major events in the natural history of the disease. The objective of this study is to examine whether there is a difference in the risk of psychiatric comorbidity following surgery in CD and UC. METHODS: We used a multi-institution cohort of IBD patients without a diagnosis code for anxiety or depression preceding their IBD-related surgery or hospitalization. Demographic-, disease-, and treatment-related variables were retrieved. Multivariate logistic regression analysis was performed to individually identify risk factors for depression and anxiety. RESULTS: Our study included a total of 707 CD and 530 UC patients who underwent bowel resection surgery and did not have depression before surgery. The risk of depression 5 years after surgery was 16% and 11% in CD and UC patients, respectively. We found no difference in the risk of depression following surgery in the CD and UC patients (adjusted odds ratio, 1.11; 95% confidence interval, 0.84-1.47). Female gender, comorbidity, immunosuppressant use, perianal disease, stoma surgery, and early surgery within 3 years of care predicted depression after CD surgery; only female gender and comorbidity predicted depression in UC patients. Only 12% of the CD cohort had ≥4 risk factors for depression, but among them nearly 44% subsequently received a diagnosis code for depression. CONCLUSIONS: IBD-related surgery or hospitalization is associated with a significant risk for depression and anxiety, with a similar magnitude of risk in both diseases. Am J Gastroenterol advance online publication, 22 January 2013; doi:10.1038/ajg.2012.471.
    The American Journal of Gastroenterology 01/2013; · 7.28 Impact Factor
  • Article: SHRINE: Enabling Nationally Scalable Multi-Site Disease Studies.
    ABSTRACT: Results of medical research studies are often contradictory or cannot be reproduced. One reason is that there may not be enough patient subjects available for observation for a long enough time period. Another reason is that patient populations may vary considerably with respect to geographic and demographic boundaries thus limiting how broadly the results apply. Even when similar patient populations are pooled together from multiple locations, differences in medical treatment and record systems can limit which outcome measures can be commonly analyzed. In total, these differences in medical research settings can lead to differing conclusions or can even prevent some studies from starting. We thus sought to create a patient research system that could aggregate as many patient observations as possible from a large number of hospitals in a uniform way. We call this system the 'Shared Health Research Information Network', with the following properties: (1) reuse electronic health data from everyday clinical care for research purposes, (2) respect patient privacy and hospital autonomy, (3) aggregate patient populations across many hospitals to achieve statistically significant sample sizes that can be validated independently of a single research setting, (4) harmonize the observation facts recorded at each institution such that queries can be made across many hospitals in parallel, (5) scale to regional and national collaborations. The purpose of this report is to provide open source software for multi-site clinical studies and to report on early uses of this application. At this time SHRINE implementations have been used for multi-site studies of autism co-morbidity, juvenile idiopathic arthritis, peripartum cardiomyopathy, colorectal cancer, diabetes, and others. The wide range of study objectives and growing adoption suggest that SHRINE may be applicable beyond the research uses and participating hospitals named in this report.
    PLoS ONE 01/2013; 8(3):e55811. · 4.09 Impact Factor
  • Article: Current state of information technologies for the clinical research enterprise across academic medical centers.
    ABSTRACT: Information technology (IT) to support clinical research has steadily grown over the past 10 years. Many new applications at the enterprise level are available to assist with the numerous tasks necessary in performing clinical research. However, it is not clear how rapidly this technology is being adopted or whether it is making an impact upon how clinical research is being performed. The Clinical Research Forum's IT Roundtable performed a survey of 17 representative academic medical centers (AMCs) to understand the adoption rate and implementation strategies within this field. The results were compared with similar surveys from 4 and 6 years ago. We found the adoption rate for four prominent areas of IT-supported clinical research had increased remarkably, specifically regulatory compliance, electronic data capture for clinical trials, data repositories for secondary use of clinical data, and infrastructure for supporting collaboration. Adoption of other areas of clinical research IT was more irregular with wider differences between AMCs. These differences appeared to be partially due to a set of openly available applications that have emerged to occupy an important place in the landscape of clinical research enterprise-level support at AMCs.
    Clinical and Translational Science 06/2012; 5(3):281-4. · 1.13 Impact Factor
  • Article: Apps to display patient data, making SMART available in the i2b2 platform.
    ABSTRACT: The Substitutable Medical Apps, Reusable Technologies (SMART) project provides a framework of core services to facilitate the use of substitutable health-related web applications. The platform offers a common interface used to make health IT systems "SMART-ready", allowing any SMART application to interact with those systems. At Partners Healthcare, we have SMART-enabled the Informatics for Integrating Biology and the Bedside (i2b2) open source analytical platform, enabling the use of SMART applications directly within the i2b2 web client. In i2b2, viewing the patient in an EMR-like view enables a natural-feeling medical review process for each patient.
    AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium 01/2012; 2012:960-9.
  • Article: Strategies for maintaining patient privacy in i2b2.
    ABSTRACT: The re-use of patient data from electronic healthcare record systems can provide tremendous benefits for clinical research, but measures to protect patient privacy while utilizing these records have many challenges. Some of these challenges arise from a misperception that the problem should be solved technically when actually the problem needs a holistic solution. The authors' experience with informatics for integrating biology and the bedside (i2b2) use cases indicates that the privacy of the patient should be considered on three fronts: technical de-identification of the data, trust in the researcher and the research, and the security of the underlying technical platforms. The security structure of i2b2 is implemented based on consideration of all three fronts. It has been supported with several use cases across the USA, resulting in five privacy categories of users that serve to protect the data while supporting the use cases. The i2b2 architecture is designed to provide consistency and faithfully implement these user privacy categories. These privacy categories help reflect the policy of both the Health Insurance Portability and Accountability Act and the provisions of the National Research Act of 1974, as embodied by current institutional review boards. By implementing a holistic approach to patient privacy solutions, i2b2 is able to help close the gap between principle and practice.
    Journal of the American Medical Informatics Association 12/2011; 18 Suppl 1:i103-8. · 3.61 Impact Factor
  • Article: A translational engine at the national scale: informatics for integrating biology and the bedside.
    ABSTRACT: Informatics for integrating biology and the bedside (i2b2) seeks to provide the instrumentation for using the informational by-products of health care and the biological materials accumulated through the delivery of health care to conduct discovery research and to study the healthcare system in vivo. This complements existing efforts such as prospective cohort studies or trials outside the delivery of routine health care. i2b2 has been used to generate genome-wide studies at less than one tenth the cost and one tenth the time of conventionally performed studies as well as to identify important risk from commonly used medications. i2b2 has been adopted by over 60 academic health centers internationally.
    Journal of the American Medical Informatics Association 11/2011; 19(2):181-5. · 3.61 Impact Factor
  • Article: Genetic basis of autoantibody positive and negative rheumatoid arthritis risk in a multi-ethnic cohort derived from electronic health records.
    ABSTRACT: Discovering and following up on genetic associations with complex phenotypes require large patient cohorts. This is particularly true for patient cohorts of diverse ancestry and clinically relevant subsets of disease. The ability to mine the electronic health records (EHRs) of patients followed as part of routine clinical care provides a potential opportunity to efficiently identify affected cases and unaffected controls for appropriate-sized genetic studies. Here, we demonstrate proof-of-concept that it is possible to use EHR data linked with biospecimens to establish a multi-ethnic case-control cohort for genetic research of a complex disease, rheumatoid arthritis (RA). In 1,515 EHR-derived RA cases and 1,480 controls matched for both genetic ancestry and disease-specific autoantibodies (anti-citrullinated protein antibodies [ACPA]), we demonstrate that the odds ratios and aggregate genetic risk score (GRS) of known RA risk alleles measured in individuals of European ancestry within our EHR cohort are nearly identical to those derived from a genome-wide association study (GWAS) of 5,539 autoantibody-positive RA cases and 20,169 controls. We extend this approach to other ethnic groups and identify a large overlap in the GRS among individuals of European, African, East Asian, and Hispanic ancestry. We also demonstrate that the distribution of a GRS based on 28 non-HLA risk alleles in ACPA+ cases partially overlaps with ACPA- subgroup of RA cases. Our study demonstrates that the genetic basis of rheumatoid arthritis risk is similar among cases of diverse ancestry divided into subsets based on ACPA status and emphasizes the utility of linking EHR clinical data with biospecimens for genetic studies.
    The American Journal of Human Genetics 01/2011; 88(1):57-69. · 10.60 Impact Factor
  • Article: Serving the enterprise and beyond with informatics for integrating biology and the bedside (i2b2).
    ABSTRACT: Informatics for Integrating Biology and the Bedside (i2b2) is one of seven projects sponsored by the NIH Roadmap National Centers for Biomedical Computing (http://www.ncbcs.org). Its mission is to provide clinical investigators with the tools necessary to integrate medical record and clinical research data in the genomics age, a software suite to construct and integrate the modern clinical research chart. i2b2 software may be used by an enterprise's research community to find sets of interesting patients from electronic patient medical record data, while preserving patient privacy through a query tool interface. Project-specific mini-databases ("data marts") can be created from these sets to make highly detailed data available on these specific patients to the investigators on the i2b2 platform, as reviewed and restricted by the Institutional Review Board. The current version of this software has been released into the public domain and is available at the URL: http://www.i2b2.org/software.
    Journal of the American Medical Informatics Association 03/2010; 17(2):124-30. · 3.61 Impact Factor
  • Article: I am Not Dead Yet: Identification of False-Positive Matches to Death Master File.
    Alexander Turchin, Maria Shubina, Shawn N Murphy
    ABSTRACT: Patient death is an important clinical outcome. It is typically ascertained by matching database records with external death indices. Accuracy of the matching algorithms is imperfect. We have investigated whether clinical records made > 1 month after the date of death accurately identify false positive matches to the Death Master File. Positive predictive value (PPV) varied from 74.7% (notes) to 95.9% (labs) and sensitivity from 57.4% (adverse medication reactions) to 94.9% (notes). Presence of any two out of four (billing data, labs, vital signs and medications) data elements had sensitivity of 83.0% and PPV of 98.3%. Area under the ROC curve for a multivariable logistic model that included the number of these four data elements recorded > 1 month after death was 0.987. Clinical data recorded after the date of death can help identify false positive matches to death indices and could be utilized to improve existing record linkage algorithms.
    AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium 01/2010; 2010:807-11.
  • Article: Rapid identification of myocardial infarction risk associated with diabetes medications using electronic medical records.
    ABSTRACT: OBJECTIVE: To assess the ability to identify potential association(s) of diabetes medications with myocardial infarction using usual care clinical data obtained from the electronic medical record. RESEARCH DESIGN AND METHODS: We defined a retrospective cohort of patients (n = 34,253) treated with a sulfonylurea, metformin, rosiglitazone, or pioglitazone in a single academic health care network. All patients were aged >18 years with at least one prescription for one of the medications between 1 January 2000 and 31 December 2006. The study outcome was acute myocardial infarction requiring hospitalization. We used a cumulative temporal approach to ascertain the calendar date for earliest identifiable risk associated with rosiglitazone compared with that for other therapies. RESULTS: Sulfonylurea, metformin, rosiglitazone, or pioglitazone therapy was prescribed for 11,200, 12,490, 1,879, and 806 patients, respectively. A total of 1,343 myocardial infarctions were identified. After adjustment for potential myocardial infarction risk factors, the relative risk for myocardial infarction with rosiglitazone was 1.3 (95% CI 1.1-1.6) compared with sulfonylurea, 2.2 (1.6-3.1) compared with metformin, and 2.2 (1.5-3.4) compared with pioglitazone. Prospective surveillance using these data would have identified increased risk for myocardial infarction with rosiglitazone compared with metformin within 18 months of its introduction with a risk ratio of 2.1 (95% CI 1.2-3.8). CONCLUSIONS: Our results are consistent with a relative adverse cardiovascular risk profile for rosiglitazone. Our use of usual care electronic data sources from a large hospital network represents an innovative approach to rapid safety signal detection that may enable more effective postmarketing drug surveillance.
    Diabetes Care 12/2009; 33(3):526-31. · 8.09 Impact Factor
  • Article: The Shared Health Research Information Network (SHRINE): a prototype federated query tool for clinical data repositories.
    ABSTRACT: The authors developed a prototype Shared Health Research Information Network (SHRINE) to identify the technical, regulatory, and political challenges of creating a federated query tool for clinical data repositories. Separate Institutional Review Boards (IRBs) at Harvard's three largest affiliated health centers approved use of their data, and the Harvard Medical School IRB approved building a Query Aggregator Interface that can simultaneously send queries to each hospital and display aggregate counts of the number of matching patients. Our experience creating three local repositories using the open source Informatics for Integrating Biology and the Bedside (i2b2) platform can be used as a road map for other institutions. The authors are actively working with the IRBs and regulatory groups to develop procedures that will ultimately allow investigators to obtain identified patient data and biomaterials through SHRINE. This will guide us in creating a future technical architecture that is scalable to a national level, compliant with ethical guidelines, and protective of the interests of the participating hospitals.
    Journal of the American Medical Informatics Association 07/2009; 16(5):624-30. · 3.61 Impact Factor
  • Article: Application of Information Technology: The Shared Health Research Information Network (SHRINE): A Prototype Federated Query Tool for Clinical Data Repositories.
    JAMIA. 01/2009; 16:624-630.
  • Conference Proceeding: A PSO/ACO approach to knowledge discovery in a pharmacovigilance context.
    Margarita Sordo, Gabriela Ochoa, Shawn N. Murphy
    Genetic and Evolutionary Computation Conference, GECCO 2009, Proceedings, Montreal, Québec, Canada, July 8-12, 2009, Companion Material; 01/2009
  • Article: E-facts: business process management in clinical data repositories.
    ABSTRACT: The Partners Healthcare Research Patient Data Registry (RPDR) is a centralized data repository that gathers clinical data from various hospital systems. The RPDR allows clinical investigators to obtain aggregate numbers of patients with user-defined characteristics such as diagnoses, procedures, medications, and laboratory values. They may then obtain patient identifiers and electronic medical records with prior IRB approval. Moreover, accurately identifying worthwhile, quantifiable facts in doctors' reports and efficiently populating the RPDR with them is a significant process. As part of our ongoing e-Fact project, this work describes a new business process management technology that helps coordinate and simplify this procedure.
    AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium 02/2008;
  • Article: Simulated yields of prospective specimen collection from specific patient cohorts using retrospective data from a research patient data repository.
    Anil K Dubey, Vivian Gainer, Shawn N Murphy
    ABSTRACT: Biospecimens obtained during the process of care may be a potential resource for bioinformatics research requiring specimens from specific patient populations. In order to estimate the potential yield of collecting specimens obtained during the process of care, we used a research patient data registry to define specific patient cohorts, and then used these cohorts to screen for chemistry and hematology tests also contained in the registry. We present here our approach and initial results.
    AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium 02/2008;
  • Article: STROBE-based methodology for detection of adverse events across multiple communities.
    ABSTRACT: Partners Healthcare is one of five institutions collaborating with the eHealth Initiative (eHI) and the FDA in a nationwide effort to develop novel health information technology tools to create an active drug safety surveillance system across the U.S. The STROBE statement serves as the standard for the definition of a structured, systematic, reproducible approach for detecting both the risks and benefits of drug treatments in multiple settings.
    AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium 02/2008;
  • Article: A study of the age attribute in a query tool for a clinical data warehouse.
    ABSTRACT: The RPDR, a clinical data warehouse with a user-friendly Querytool, allows researchers to perform studies on patient data. Currently, the RPDR represents age as the patient's age at the present time, which is problematic in situations where age at the time of the event is more appropriate. We will modify the Querytool to consider this by assessing the perception of age via survey, testing backend query solutions, and developing modifications based on these results.
    AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium 02/2008;
  • Article: Integrating outside modules into the i2b2 architecture.
    ABSTRACT: Informatics for Integrating Biology and the Bedside (i2b2) is one of the sponsored initiatives of the NIH Roadmap National Centers for Biomedical Computing (http://www.bisti.nih.gov/ncbc/). A major goal of i2b2 is to provide clinical investigators broadly with the software tools necessary to collect and manage project-related clinical research data in the genomics age as a cohesive entity: a software suite to construct and manage the modern clinical research chart.
    AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium 02/2008;
  • Article: Mining for associations between categorical data items in a clinical data repository.
    ABSTRACT: We present here our preliminary work in using simple two-way categorical tests to discover associations between categorical items in a clinical data repository. Initial results using the chi square test yielded diagnosis code associations that seemed plausible as well as several that did not. This may be due in part to the effect of sample size. Tests more resistant to the effects of sample size may yield a higher fraction of plausible diagnosis code associations.
    AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium 02/2007;
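
Illustration (not from the cited paper): the last abstract above notes that chi square results on categorical items may be driven in part by sample size. The sketch below is a minimal, hypothetical example of that effect, not the authors' method or data. The function name `chi_square_2x2` and all counts are invented for illustration. It computes the Pearson chi-square statistic (without continuity correction) for a 2x2 table of two diagnosis codes, then repeats the calculation with every cell multiplied by 100, so the proportions are identical but the sample is larger.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction)
    for the 2x2 contingency table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must have a nonzero total")
    # Closed form for a 2x2 table: n * (ad - bc)^2 / (product of marginals)
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Hypothetical counts: rows = code A present/absent, columns = code B present/absent.
small = chi_square_2x2(12, 8, 8, 12)          # 40 patients
large = chi_square_2x2(1200, 800, 800, 1200)  # same proportions, 4000 patients

# Critical value of chi-square with 1 degree of freedom at alpha = 0.05.
CRITICAL_1DF_05 = 3.841

print(small, small > CRITICAL_1DF_05)
print(large, large > CRITICAL_1DF_05)
```

Multiplying every cell by a constant multiplies this statistic by the same constant, so in a very large repository even a weak association can exceed the significance threshold. That is one reason a sample-size-independent effect measure might be reported alongside the test; which measure the authors had in mind is not stated in the abstract.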